Overview

Dataset statistics

Number of variables: 34
Number of observations: 75670
Missing cells: 85293
Missing cells (%): 3.3%
Duplicate rows: 297
Duplicate rows (%): 0.4%
Total size in memory: 126.4 MiB
Average record size in memory: 1.7 KiB
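The derived figures in this table follow from the raw counts. The sketch below recomputes them from the reported values only; the variable names are illustrative, and it assumes the missing-cell percentage is taken over all cells (variables × observations), which matches the numbers shown.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 34
n_observations = 75670
missing_cells = 85293
duplicate_rows = 297
total_size_mib = 126.4

# Assumption: the missing percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}")
print(f"Duplicate rows (%): {duplicate_pct:.1f}")
print(f"Average record size in memory: {avg_record_kib:.1f} KiB")
```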

Variable types

Categorical: 18
Numeric: 7
Text: 4
DateTime: 4
Boolean: 1

Alerts

Compactor has constant value "" [Constant]
Dataset has 297 (0.4%) duplicate rows [Duplicates]
Address is highly overall correlated with Asset Type and 5 other fields [High correlation]
Asset Category is highly overall correlated with Charge Code Desc. and 7 other fields [High correlation]
Asset Type is highly overall correlated with Address and 3 other fields [High correlation]
Charge Code Desc. is highly overall correlated with Asset Category and 3 other fields [High correlation]
Charge Group is highly overall correlated with Charge Code Desc. and 3 other fields [High correlation]
Charge Sub-Type is highly overall correlated with Address and 3 other fields [High correlation]
City is highly overall correlated with Address and 5 other fields [High correlation]
Container Size is highly overall correlated with Container Type and 5 other fields [High correlation]
Container Type is highly overall correlated with Asset Category and 6 other fields [High correlation]
Location ID is highly overall correlated with Address and 4 other fields [High correlation]
Original Volume is highly overall correlated with Trash Tons [High correlation]
Pounds per Yard is highly overall correlated with Address and 9 other fields [High correlation]
Quantity is highly overall correlated with Charge Group and 1 other field [High correlation]
Recycle Tons is highly overall correlated with Asset Category and 2 other fields [High correlation]
Report Period is highly overall correlated with Service Period [High correlation]
Service Period is highly overall correlated with Report Period [High correlation]
State is highly overall correlated with Address and 2 other fields [High correlation]
Tons Derived is highly overall correlated with Asset Category and 6 other fields [High correlation]
Trash Tons is highly overall correlated with Original Volume [High correlation]
UOM for Volume is highly overall correlated with Asset Category and 4 other fields [High correlation]
Unit of Measure is highly overall correlated with Charge Code Desc. and 2 other fields [High correlation]
Waste Stream is highly overall correlated with Asset Category and 7 other fields [High correlation]
Waste Type is highly overall correlated with Asset Category and 6 other fields [High correlation]
Container Type is highly imbalanced (85.7%) [Imbalance]
Asset Category is highly imbalanced (82.8%) [Imbalance]
Waste Stream is highly imbalanced (89.3%) [Imbalance]
Waste Type is highly imbalanced (83.0%) [Imbalance]
UOM for Volume is highly imbalanced (79.8%) [Imbalance]
Tons Derived is highly imbalanced (74.6%) [Imbalance]
Compactor has 29161 (38.5%) missing values [Missing]
Charge Sub-Type has 52372 (69.2%) missing values [Missing]
Recycle Tons is highly skewed (γ1 = 22.47100786) [Skewed]
Trash Tons has 1772 (2.3%) zeros [Zeros]
Recycle Tons has 74338 (98.2%) zeros [Zeros]
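
The Imbalance percentages are consistent with a normalised-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. This is inferred from the counts in this report, not taken from the profiler's documentation, so treat the sketch below as an assumption. The function name `imbalance_score` is illustrative, and the class counts are the ones listed in the Variables section.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    H is the Shannon entropy in bits and k the number of classes.
    A score near 1 means one class dominates; 0 means a uniform split.
    """
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is scored as 0 in this sketch
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts copied from the Variables section of this report.
asset_category = [73733, 1937]                     # Industrial, Commercial
waste_stream = [74605, 1065]                       # Trash, Recycle
container_type = [71986, 1743, 792, 756, 389, 4]   # RO, SC, FL, FR, HP, IR

for name, counts in [("Asset Category", asset_category),
                     ("Waste Stream", waste_stream),
                     ("Container Type", container_type)]:
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

With these counts the sketch reproduces the 82.8%, 89.3% and 85.7% reported in the alerts.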

Reproduction

Analysis started: 2023-12-20 11:45:43.777415
Analysis finished: 2023-12-20 11:46:31.514048
Duration: 47.74 seconds
Software version: ydata-profiling v4.6.3
Download configuration: config.json
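
The reported duration is the difference between the two timestamps. A minimal check using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-20 11:45:43.777415")
finished = datetime.fromisoformat("2023-12-20 11:46:31.514048")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```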

Variables

Location ID
Categorical

HIGH CORRELATION 

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 MiB

Value                Count
110                  7573
118                  6108
108                  5691
122                  4353
124                  4080
Other values (26)    47865

Length

Max length: 8
Median length: 3
Mean length: 3.1186996
Min length: 3

Characters and Unicode

Total characters: 235992
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
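
As an illustration of these character properties, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. The helper name `category_counts` and the code-to-name mapping are assumptions for this example. The sample strings are values that appear in the Sample rows of this report.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

for sample in ("101", "800-0303937", "FLOOR & DECOR 101 30YD CLEANUP"):
    counts = category_counts(sample)
    readable = {CATEGORY_NAMES.get(code, code): n for code, n in counts.items()}
    print(f"{sample!r}: {readable}")
```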

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 101
2nd row: 101
3rd row: 101
4th row: 101
5th row: 101

Common Values

Value                Count    Frequency (%)
110                  7573     10.0%
118                  6108     8.1%
108                  5691     7.5%
122                  4353     5.8%
124                  4080     5.4%
103                  3768     5.0%
119                  3617     4.8%
116                  3481     4.6%
117                  3297     4.4%
104                  3186     4.2%
Other values (21)    30516    40.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
110 7573
 
9.6%
118 6108
 
7.8%
108 5691
 
7.2%
122 4353
 
5.5%
124 4080
 
5.2%
103 3768
 
4.8%
119 3617
 
4.6%
116 3481
 
4.4%
117 3297
 
4.2%
104 3186
 
4.1%
Other values (24) 33452
42.6%

Most occurring characters

ValueCountFrequency (%)
1 115943
49.1%
0 31815
 
13.5%
2 24092
 
10.2%
8 11799
 
5.0%
9 11413
 
4.8%
4 8939
 
3.8%
3 7141
 
3.0%
5 5658
 
2.4%
6 5355
 
2.3%
7 4854
 
2.1%
Other values (12) 8983
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 227009
96.2%
Uppercase Letter 6043
 
2.6%
Space Separator 2936
 
1.2%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 115943
51.1%
0 31815
 
14.0%
2 24092
 
10.6%
8 11799
 
5.2%
9 11413
 
5.0%
4 8939
 
3.9%
3 7141
 
3.1%
5 5658
 
2.5%
6 5355
 
2.4%
7 4854
 
2.1%
Uppercase Letter
ValueCountFrequency (%)
C 2932
48.5%
D 2928
48.5%
R 173
 
2.9%
S 4
 
0.1%
G 1
 
< 0.1%
J 1
 
< 0.1%
T 1
 
< 0.1%
E 1
 
< 0.1%
M 1
 
< 0.1%
P 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2936
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 229949
97.4%
Latin 6043
 
2.6%

Most frequent character per script

Common
ValueCountFrequency (%)
1 115943
50.4%
0 31815
 
13.8%
2 24092
 
10.5%
8 11799
 
5.1%
9 11413
 
5.0%
4 8939
 
3.9%
3 7141
 
3.1%
5 5658
 
2.5%
6 5355
 
2.3%
7 4854
 
2.1%
Other values (2) 2940
 
1.3%
Latin
ValueCountFrequency (%)
C 2932
48.5%
D 2928
48.5%
R 173
 
2.9%
S 4
 
0.1%
G 1
 
< 0.1%
J 1
 
< 0.1%
T 1
 
< 0.1%
E 1
 
< 0.1%
M 1
 
< 0.1%
P 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 235992
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 115943
49.1%
0 31815
 
13.5%
2 24092
 
10.2%
8 11799
 
5.0%
9 11413
 
4.8%
4 8939
 
3.8%
3 7141
 
3.0%
5 5658
 
2.4%
6 5355
 
2.3%
7 4854
 
2.1%
Other values (12) 8983
 
3.8%

Invoice Number
Real number (ℝ)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20220622
Minimum: 20220100
Maximum: 20221200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 MiB

Quantile statistics

Minimum: 20220100
5-th percentile: 20220100
Q1: 20220300
median: 20220600
Q3: 20220900
95-th percentile: 20221200
Maximum: 20221200
Range: 1100
Interquartile range (IQR): 600

Descriptive statistics

Standard deviation: 338.37986
Coefficient of variation (CV): 1.6734394 × 10^-5
Kurtosis: -1.1751507
Mean: 20220622
Median Absolute Deviation (MAD): 300
Skewness: 0.045384881
Sum: 1.5300945 × 10^12
Variance: 114500.93
Monotonicity: Not monotonic
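
Several of these figures can be cross-checked against one another. The sketch below uses only values reported for this variable plus the observation count from the overview; the variable names are illustrative. It assumes the usual definitions: CV = standard deviation / mean, range = maximum - minimum, IQR = Q3 - Q1, variance = standard deviation squared, sum = mean × number of observations.

```python
import math

# Values copied from the statistics reported for Invoice Number.
mean = 20220622
std_dev = 338.37986
minimum, maximum = 20220100, 20221200
q1, q3 = 20220300, 20220900
n_observations = 75670  # from the dataset overview

cv = std_dev / mean              # coefficient of variation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
variance = std_dev ** 2          # variance from the standard deviation
total = mean * n_observations    # sum, approximate because the mean is rounded

print(f"CV       = {cv:.7e}")
print(f"Range    = {value_range}")
print(f"IQR      = {iqr}")
print(f"Variance = {variance:.2f}")
print(f"Sum      = {total:.7e}")
```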
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
20220900 8377
11.1%
20220100 7577
10.0%
20220500 7166
9.5%
20220400 6907
9.1%
20220700 6633
8.8%
20220200 6306
8.3%
20220300 6039
8.0%
20220800 5838
7.7%
20220600 5552
7.3%
20221000 5488
7.3%
Other values (2) 9787
12.9%
ValueCountFrequency (%)
20220100 7577
10.0%
20220200 6306
8.3%
20220300 6039
8.0%
20220400 6907
9.1%
20220500 7166
9.5%
20220600 5552
7.3%
20220700 6633
8.8%
20220800 5838
7.7%
20220900 8377
11.1%
20221000 5488
7.3%
ValueCountFrequency (%)
20221200 4807
6.4%
20221100 4980
6.6%
20221000 5488
7.3%
20220900 8377
11.1%
20220800 5838
7.7%
20220700 6633
8.8%
20220600 5552
7.3%
20220500 7166
9.5%
20220400 6907
9.1%
20220300 6039
8.0%
Distinct: 199
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 832370
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 800-0303937
2nd row: 800-0303937
3rd row: 800-0303937
4th row: 800-0303937
5th row: 800-0303937
ValueCountFrequency (%)
690-8001477 5472
 
7.2%
625-8109512 5126
 
6.8%
535-8004902 4248
 
5.6%
625-8109508 3960
 
5.2%
687-8001871 3408
 
4.5%
625-0125391 3324
 
4.4%
615-8000411 3072
 
4.1%
853-0136012 2928
 
3.9%
535-8004903 2592
 
3.4%
794-8016671 1872
 
2.5%
Other values (189) 39668
52.4%

Most occurring characters

ValueCountFrequency (%)
0 169438
20.4%
8 98674
11.9%
1 91004
10.9%
5 86798
10.4%
- 75670
9.1%
6 64120
 
7.7%
2 61467
 
7.4%
9 51714
 
6.2%
3 49951
 
6.0%
4 45591
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 756700
90.9%
Dash Punctuation 75670
 
9.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 169438
22.4%
8 98674
13.0%
1 91004
12.0%
5 86798
11.5%
6 64120
 
8.5%
2 61467
 
8.1%
9 51714
 
6.8%
3 49951
 
6.6%
4 45591
 
6.0%
7 37943
 
5.0%
Dash Punctuation
ValueCountFrequency (%)
- 75670
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 832370
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 169438
20.4%
8 98674
11.9%
1 91004
10.9%
5 86798
10.4%
- 75670
9.1%
6 64120
 
7.7%
2 61467
 
7.4%
9 51714
 
6.2%
3 49951
 
6.0%
4 45591
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 832370
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 169438
20.4%
8 98674
11.9%
1 91004
10.9%
5 86798
10.4%
- 75670
9.1%
6 64120
 
7.7%
2 61467
 
7.4%
9 51714
 
6.2%
3 49951
 
6.0%
4 45591
 
5.5%
Distinct: 209
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 1513400
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 800-0303937-00001-01
2nd row: 800-0303937-00001-01
3rd row: 800-0303937-00001-01
4th row: 800-0303937-00001-01
5th row: 800-0303937-00001-01
ValueCountFrequency (%)
690-8001477-00001-02 5472
 
7.2%
625-8109512-00001-01 5126
 
6.8%
535-8004902-00001-02 4248
 
5.6%
625-8109508-00001-01 3960
 
5.2%
687-8001871-00001-01 3408
 
4.5%
625-0125391-00001-01 3324
 
4.4%
615-8000411-00001-01 3072
 
4.1%
853-0136012-00001-01 2928
 
3.9%
535-8004903-00001-01 2592
 
3.4%
794-8016671-00001-01 1872
 
2.5%
Other values (199) 39668
52.4%

Most occurring characters

ValueCountFrequency (%)
0 547788
36.2%
1 227222
15.0%
- 227010
15.0%
8 98674
 
6.5%
5 86798
 
5.7%
2 75209
 
5.0%
6 64120
 
4.2%
9 51714
 
3.4%
3 51331
 
3.4%
4 45591
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1286390
85.0%
Dash Punctuation 227010
 
15.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 547788
42.6%
1 227222
17.7%
8 98674
 
7.7%
5 86798
 
6.7%
2 75209
 
5.8%
6 64120
 
5.0%
9 51714
 
4.0%
3 51331
 
4.0%
4 45591
 
3.5%
7 37943
 
2.9%
Dash Punctuation
ValueCountFrequency (%)
- 227010
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1513400
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 547788
36.2%
1 227222
15.0%
- 227010
15.0%
8 98674
 
6.5%
5 86798
 
5.7%
2 75209
 
5.0%
6 64120
 
4.2%
9 51714
 
3.4%
3 51331
 
3.4%
4 45591
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1513400
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 547788
36.2%
1 227222
15.0%
- 227010
15.0%
8 98674
 
6.5%
5 86798
 
5.7%
2 75209
 
5.0%
6 64120
 
4.2%
9 51714
 
3.4%
3 51331
 
3.4%
4 45591
 
3.0%
Distinct: 186
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 MiB

Length

Max length: 30
Median length: 27
Mean length: 25.993947
Min length: 5

Characters and Unicode

Total characters: 1966962
Distinct characters: 39
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: FLOOR & DECOR 101 30YD CLEANUP
2nd row: FLOOR & DECOR 101 30YD CLEANUP
3rd row: FLOOR & DECOR 101 30YD CLEANUP
4th row: FLOOR & DECOR 101 30YD CLEANUP
5th row: FLOOR & DECOR 101 30YD CLEANUP
ValueCountFrequency (%)
floor 74740
17.1%
decor 74307
17.0%
71234
16.3%
40yd 64259
14.7%
comp 46254
10.6%
110 7573
 
1.7%
118 6108
 
1.4%
108 5691
 
1.3%
122 4352
 
1.0%
124 4075
 
0.9%
Other values (81) 78454
18.0%

Most occurring characters

ValueCountFrequency (%)
361377
18.4%
O 275480
14.0%
D 157917
 
8.0%
R 154233
 
7.8%
C 133252
 
6.8%
1 114526
 
5.8%
0 98323
 
5.0%
E 82575
 
4.2%
L 80033
 
4.1%
F 76577
 
3.9%
Other values (29) 432669
22.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1164059
59.2%
Decimal Number 366298
 
18.6%
Space Separator 361377
 
18.4%
Other Punctuation 74955
 
3.8%
Dash Punctuation 273
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O 275480
23.7%
D 157917
13.6%
R 154233
13.2%
C 133252
11.4%
E 82575
 
7.1%
L 80033
 
6.9%
F 76577
 
6.6%
Y 72653
 
6.2%
P 52404
 
4.5%
M 50527
 
4.3%
Other values (16) 28408
 
2.4%
Decimal Number
ValueCountFrequency (%)
1 114526
31.3%
0 98323
26.8%
4 75765
20.7%
2 27119
 
7.4%
8 12565
 
3.4%
9 11413
 
3.1%
3 10337
 
2.8%
6 6861
 
1.9%
7 4854
 
1.3%
5 4535
 
1.2%
Space Separator
ValueCountFrequency (%)
361377
100.0%
Other Punctuation
ValueCountFrequency (%)
& 74955
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 273
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1164059
59.2%
Common 802903
40.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 275480
23.7%
D 157917
13.6%
R 154233
13.2%
C 133252
11.4%
E 82575
 
7.1%
L 80033
 
6.9%
F 76577
 
6.6%
Y 72653
 
6.2%
P 52404
 
4.5%
M 50527
 
4.3%
Other values (16) 28408
 
2.4%
Common
ValueCountFrequency (%)
361377
45.0%
1 114526
 
14.3%
0 98323
 
12.2%
4 75765
 
9.4%
& 74955
 
9.3%
2 27119
 
3.4%
8 12565
 
1.6%
9 11413
 
1.4%
3 10337
 
1.3%
6 6861
 
0.9%
Other values (3) 9662
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1966529
> 99.9%
None 433
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
361377
18.4%
O 275480
14.0%
D 157917
 
8.0%
R 154233
 
7.8%
C 133252
 
6.8%
1 114526
 
5.8%
0 98323
 
5.0%
E 82575
 
4.2%
L 80033
 
4.1%
F 76577
 
3.9%
Other values (28) 432236
22.0%
None
ValueCountFrequency (%)
É 433
100.0%

Address
Categorical

HIGH CORRELATION 

Distinct: 42
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 MiB

Value                         Count
3113 E COLONIAL DR            7573
1980 E COUNTY LINE RD         6108
1914 W ATLANTIC BLVD          5691
2525 NW 82ND AVE              4353
4 WESTSIDE SHOPPING CENTER    4075
Other values (37)             47870

Length

Max length: 30
Median length: 27
Mean length: 19.351645
Min length: 14

Characters and Unicode

Total characters: 1464339
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 1690 NORTHEAST EXPY NE
2nd row: 1690 NORTHEAST EXPY NE
3rd row: 1690 NORTHEAST EXPY NE
4th row: 1690 NORTHEAST EXPY NE
5th row: 1690 NORTHEAST EXPY NE

Common Values

Value                         Count    Frequency (%)
3113 E COLONIAL DR            7573     10.0%
1980 E COUNTY LINE RD         6108     8.1%
1914 W ATLANTIC BLVD          5691     7.5%
2525 NW 82ND AVE              4353     5.8%
4 WESTSIDE SHOPPING CENTER    4075     5.4%
8102 BLANDING BLVD            3768     5.0%
125 NW LOOP 410               3617     4.8%
21760 US HIGHWAY 19 N         3438     4.5%
7350 W 52ND AVE               3297     4.4%
2350 ALBERTA DR               3186     4.2%
Other values (32)             30564    40.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
w 21117
 
6.9%
e 17290
 
5.7%
dr 17278
 
5.7%
rd 11311
 
3.7%
nw 9844
 
3.2%
blvd 9459
 
3.1%
pkwy 8433
 
2.8%
ave 7661
 
2.5%
3113 7573
 
2.5%
colonial 7573
 
2.5%
Other values (94) 186738
61.4%

Most occurring characters

ValueCountFrequency (%)
304277
20.8%
N 78774
 
5.4%
E 78679
 
5.4%
1 78194
 
5.3%
L 67640
 
4.6%
R 64947
 
4.4%
D 61528
 
4.2%
0 60227
 
4.1%
A 59560
 
4.1%
W 52756
 
3.6%
Other values (24) 557757
38.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 827697
56.5%
Decimal Number 332365
22.7%
Space Separator 304277
 
20.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 78774
 
9.5%
E 78679
 
9.5%
L 67640
 
8.2%
R 64947
 
7.8%
D 61528
 
7.4%
A 59560
 
7.2%
W 52756
 
6.4%
T 49777
 
6.0%
O 46116
 
5.6%
I 41576
 
5.0%
Other values (13) 226344
27.3%
Decimal Number
ValueCountFrequency (%)
1 78194
23.5%
0 60227
18.1%
2 42891
12.9%
5 41852
12.6%
8 25932
 
7.8%
4 23275
 
7.0%
3 23229
 
7.0%
9 19327
 
5.8%
7 12475
 
3.8%
6 4963
 
1.5%
Space Separator
ValueCountFrequency (%)
304277
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 827697
56.5%
Common 636642
43.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 78774
 
9.5%
E 78679
 
9.5%
L 67640
 
8.2%
R 64947
 
7.8%
D 61528
 
7.4%
A 59560
 
7.2%
W 52756
 
6.4%
T 49777
 
6.0%
O 46116
 
5.6%
I 41576
 
5.0%
Other values (13) 226344
27.3%
Common
ValueCountFrequency (%)
304277
47.8%
1 78194
 
12.3%
0 60227
 
9.5%
2 42891
 
6.7%
5 41852
 
6.6%
8 25932
 
4.1%
4 23275
 
3.7%
3 23229
 
3.6%
9 19327
 
3.0%
7 12475
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1464339
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
304277
20.8%
N 78774
 
5.4%
E 78679
 
5.4%
1 78194
 
5.3%
L 67640
 
4.6%
R 64947
 
4.4%
D 61528
 
4.2%
0 60227
 
4.1%
A 59560
 
4.1%
W 52756
 
3.6%
Other values (24) 557757
38.1%

City
Categorical

HIGH CORRELATION 

Distinct: 33
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.3 MiB

Value                Count
ORLANDO              7560
HIGHLANDS RANCH      6108
HOUSTON              5707
POMPANO BEACH        5691
DORAL                4353
Other values (28)    46251

Length

Max length: 15
Median length: 12
Mean length: 8.4003304
Min length: 5

Characters and Unicode

Total characters: 635653
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: BROOKHAVEN
2nd row: BROOKHAVEN
3rd row: BROOKHAVEN
4th row: BROOKHAVEN
5th row: BROOKHAVEN

Common Values

Value                Count    Frequency (%)
ORLANDO              7560     10.0%
HIGHLANDS RANCH      6108     8.1%
HOUSTON              5707     7.5%
POMPANO BEACH        5691     7.5%
DORAL                4353     5.8%
GRETNA               4075     5.4%
JACKSONVILLE         3768     5.0%
SAN ANTONIO          3617     4.8%
CLEARWATER           3481     4.6%
ARVADA               3297     4.4%
Other values (23)    28013    37.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
orlando 7560
 
8.3%
ranch 6108
 
6.7%
highlands 6108
 
6.7%
houston 5707
 
6.3%
beach 5692
 
6.2%
pompano 5691
 
6.2%
doral 4353
 
4.8%
gretna 4075
 
4.5%
jacksonville 3768
 
4.1%
antonio 3617
 
4.0%
Other values (27) 38421
42.2%

Most occurring characters

ValueCountFrequency (%)
A 94288
14.8%
N 75562
11.9%
O 68434
10.8%
L 48272
 
7.6%
R 41614
 
6.5%
E 38474
 
6.1%
H 34221
 
5.4%
D 31217
 
4.9%
S 29534
 
4.6%
T 28271
 
4.4%
Other values (15) 145766
22.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 620223
97.6%
Space Separator 15430
 
2.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 94288
15.2%
N 75562
12.2%
O 68434
11.0%
L 48272
 
7.8%
R 41614
 
6.7%
E 38474
 
6.2%
H 34221
 
5.5%
D 31217
 
5.0%
S 29534
 
4.8%
T 28271
 
4.6%
Other values (14) 130336
21.0%
Space Separator
ValueCountFrequency (%)
15430
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 620223
97.6%
Common 15430
 
2.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 94288
15.2%
N 75562
12.2%
O 68434
11.0%
L 48272
 
7.8%
R 41614
 
6.7%
E 38474
 
6.2%
H 34221
 
5.5%
D 31217
 
5.0%
S 29534
 
4.8%
T 28271
 
4.6%
Other values (14) 130336
21.0%
Common
ValueCountFrequency (%)
15430
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 635653
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 94288
14.8%
N 75562
11.9%
O 68434
10.8%
L 48272
 
7.6%
R 41614
 
6.5%
E 38474
 
6.1%
H 34221
 
5.4%
D 31217
 
4.9%
S 29534
 
4.6%
T 28271
 
4.4%
Other values (15) 145766
22.9%

State
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 MiB

Value               Count
FL                  26532
TX                  22433
CO                  9405
AZ                  4399
LA                  4075
Other values (6)    8826

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 151340
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: GA
2nd row: GA
3rd row: GA
4th row: GA
5th row: GA

Common Values

Value    Count    Frequency (%)
FL       26532    35.1%
TX       22433    29.6%
CO       9405     12.4%
AZ       4399     5.8%
LA       4075     5.4%
GA       3140     4.1%
NV       2481     3.3%
CA       1691     2.2%
OH       1507     2.0%
TN       6        < 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
fl 26532
35.1%
tx 22433
29.6%
co 9405
 
12.4%
az 4399
 
5.8%
la 4075
 
5.4%
ga 3140
 
4.1%
nv 2481
 
3.3%
ca 1691
 
2.2%
oh 1507
 
2.0%
tn 6
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
L 30607
20.2%
F 26532
17.5%
T 22439
14.8%
X 22433
14.8%
A 13305
8.8%
C 11096
 
7.3%
O 10912
 
7.2%
Z 4399
 
2.9%
G 3140
 
2.1%
N 2487
 
1.6%
Other values (4) 3990
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 151340
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 30607
20.2%
F 26532
17.5%
T 22439
14.8%
X 22433
14.8%
A 13305
8.8%
C 11096
 
7.3%
O 10912
 
7.2%
Z 4399
 
2.9%
G 3140
 
2.1%
N 2487
 
1.6%
Other values (4) 3990
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 151340
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 30607
20.2%
F 26532
17.5%
T 22439
14.8%
X 22433
14.8%
A 13305
8.8%
C 11096
 
7.3%
O 10912
 
7.2%
Z 4399
 
2.9%
G 3140
 
2.1%
N 2487
 
1.6%
Other values (4) 3990
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 151340
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
L 30607
20.2%
F 26532
17.5%
T 22439
14.8%
X 22433
14.8%
A 13305
8.8%
C 11096
 
7.3%
O 10912
 
7.2%
Z 4399
 
2.9%
G 3140
 
2.1%
N 2487
 
1.6%
Other values (4) 3990
 
2.6%

Zip Code
Real number (ℝ)

Distinct: 33
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 60175.899
Minimum: 30144
Maximum: 92860
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 MiB

Quantile statistics

Minimum: 30144
5-th percentile: 32244
Q1: 33069
median: 75229
Q3: 78759
95-th percentile: 89014
Maximum: 92860
Range: 62716
Interquartile range (IQR): 45690

Descriptive statistics

Standard deviation: 22987.141
Coefficient of variation (CV): 0.38199912
Kurtosis: -1.7744059
Mean: 60175.899
Median Absolute Deviation (MAD): 10079
Skewness: -0.27261109
Sum: 4.5535103 × 10^9
Variance: 5.2840863 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=33)
ValueCountFrequency (%)
32803 7573
 
10.0%
80126 6108
 
8.1%
33069 5691
 
7.5%
33122 4353
 
5.8%
70053 4075
 
5.4%
32244 3768
 
5.0%
78216 3617
 
4.8%
33765 3481
 
4.6%
80002 3297
 
4.4%
75229 3186
 
4.2%
Other values (23) 30521
40.3%
ValueCountFrequency (%)
30144 1874
 
2.5%
30260 571
 
0.8%
30329 695
 
0.9%
32244 3768
5.0%
32803 7573
10.0%
33069 5691
7.5%
33122 4353
5.8%
33183 5
 
< 0.1%
33426 1
 
< 0.1%
33619 1669
 
2.2%
ValueCountFrequency (%)
92860 1691
 
2.2%
89014 2481
3.3%
85308 2709
3.6%
85283 1668
 
2.2%
85038 9
 
< 0.1%
80126 6108
8.1%
80002 3297
4.4%
78759 2783
3.7%
78216 3617
4.8%
77523 2928
3.9%

Container Type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 MiB

Value    Count
RO       71986
SC       1743
FL       792
FR       756
HP       389

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 151340
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RO
2nd row: RO
3rd row: RO
4th row: RO
5th row: RO

Common Values

Value    Count    Frequency (%)
RO       71986    95.1%
SC       1743     2.3%
FL       792      1.0%
FR       756      1.0%
HP       389      0.5%
IR       4        < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
ro 71986
95.1%
sc 1743
 
2.3%
fl 792
 
1.0%
fr 756
 
1.0%
hp 389
 
0.5%
ir 4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
R 72746
48.1%
O 71986
47.6%
S 1743
 
1.2%
C 1743
 
1.2%
F 1548
 
1.0%
L 792
 
0.5%
H 389
 
0.3%
P 389
 
0.3%
I 4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 151340
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 72746
48.1%
O 71986
47.6%
S 1743
 
1.2%
C 1743
 
1.2%
F 1548
 
1.0%
L 792
 
0.5%
H 389
 
0.3%
P 389
 
0.3%
I 4
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 151340
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 72746
48.1%
O 71986
47.6%
S 1743
 
1.2%
C 1743
 
1.2%
F 1548
 
1.0%
L 792
 
0.5%
H 389
 
0.3%
P 389
 
0.3%
I 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 151340
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 72746
48.1%
O 71986
47.6%
S 1743
 
1.2%
C 1743
 
1.2%
F 1548
 
1.0%
L 792
 
0.5%
H 389
 
0.3%
P 389
 
0.3%
I 4
 
< 0.1%

Container Size
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.446121
Minimum: 0
Maximum: 42
Zeros: 389
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 30
Q1: 40
median: 40
Q3: 40
95-th percentile: 40
Maximum: 42
Range: 42
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 6.082431
Coefficient of variation (CV): 0.15820662
Kurtosis: 22.27431
Mean: 38.446121
Median Absolute Deviation (MAD): 0
Skewness: -4.6442891
Sum: 2909218
Variance: 36.995967
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
ValueCountFrequency (%)
40 66616
88.0%
30 3895
 
5.1%
42 1512
 
2.0%
36 1272
 
1.7%
8 756
 
1.0%
4 564
 
0.7%
20 438
 
0.6%
0 389
 
0.5%
6 228
 
0.3%
ValueCountFrequency (%)
0 389
 
0.5%
4 564
 
0.7%
6 228
 
0.3%
8 756
 
1.0%
20 438
 
0.6%
30 3895
 
5.1%
36 1272
 
1.7%
40 66616
88.0%
42 1512
 
2.0%
ValueCountFrequency (%)
42 1512
 
2.0%
40 66616
88.0%
36 1272
 
1.7%
30 3895
 
5.1%
20 438
 
0.6%
8 756
 
1.0%
6 228
 
0.3%
4 564
 
0.7%
0 389
 
0.5%

Compactor
Categorical

CONSTANT  MISSING 

Distinct: 1
Distinct (%): < 0.1%
Missing: 29161
Missing (%): 38.5%
Memory size: 6.3 MiB

Value    Count
C        46509

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 46509
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C
2nd row: C
3rd row: C
4th row: C
5th row: C

Common Values

Value        Count    Frequency (%)
C            46509    61.5%
(Missing)    29161    38.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
c 46509
100.0%

Most occurring characters

ValueCountFrequency (%)
C 46509
100.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 46509
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 46509
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 46509
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 46509
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 46509
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 46509
100.0%

Asset Type
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.4 MiB

Value        Count
PERMANENT    56614
TEMPORARY    19056

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 681030
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TEMPORARY
2nd row: TEMPORARY
3rd row: TEMPORARY
4th row: TEMPORARY
5th row: TEMPORARY

Common Values

Value        Count    Frequency (%)
PERMANENT    56614    74.8%
TEMPORARY    19056    25.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
permanent 56614
74.8%
temporary 19056
 
25.2%

Most occurring characters

ValueCountFrequency (%)
E 132284
19.4%
N 113228
16.6%
R 94726
13.9%
P 75670
11.1%
M 75670
11.1%
A 75670
11.1%
T 75670
11.1%
O 19056
 
2.8%
Y 19056
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 681030
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 132284
19.4%
N 113228
16.6%
R 94726
13.9%
P 75670
11.1%
M 75670
11.1%
A 75670
11.1%
T 75670
11.1%
O 19056
 
2.8%
Y 19056
 
2.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 681030
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 132284
19.4%
N 113228
16.6%
R 94726
13.9%
P 75670
11.1%
M 75670
11.1%
A 75670
11.1%
T 75670
11.1%
O 19056
 
2.8%
Y 19056
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 681030
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 132284
19.4%
N 113228
16.6%
R 94726
13.9%
P 75670
11.1%
M 75670
11.1%
A 75670
11.1%
T 75670
11.1%
O 19056
 
2.8%
Y 19056
 
2.8%

Asset Category
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.4 MiB

Value         Count
Industrial    73733
Commercial    1937

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 756700
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Industrial
2nd row: Industrial
3rd row: Industrial
4th row: Industrial
5th row: Industrial

Common Values

Value        Count   Frequency (%)
Industrial   73733   97.4%
Commercial   1937    2.6%
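
The IMBALANCE flag on this variable reflects how lopsided the two counts are. One common way to put a number on that is one minus the normalised Shannon entropy of the value counts. The sketch below uses only the two counts reported above; the function name `imbalance_score` is illustrative, and it is an assumption that the alert is based on a measure of this kind rather than a statement of how the report computes it.

```python
import math

# Value counts as reported for Asset Category.
counts = {"Industrial": 73733, "Commercial": 1937}


def imbalance_score(value_counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 for one dominant class."""
    total = sum(value_counts.values())
    k = len(value_counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits, skipping empty classes.
    entropy = -sum(
        (n / total) * math.log2(n / total)
        for n in value_counts.values()
        if n > 0
    )
    return 1.0 - entropy / math.log2(k)


score = imbalance_score(counts)
print(round(score, 2))  # approximately 0.83
```

A score this close to 1 says that almost all rows fall into a single class, which matches the 97.4% / 2.6% split in the table.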

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count   Frequency (%)
industrial   73733   97.4%
commercial   1937    2.6%

Most occurring characters

ValueCountFrequency (%)
r 75670
10.0%
i 75670
10.0%
a 75670
10.0%
l 75670
10.0%
I 73733
9.7%
n 73733
9.7%
d 73733
9.7%
u 73733
9.7%
s 73733
9.7%
t 73733
9.7%
Other values (5) 11622
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 681030
90.0%
Uppercase Letter 75670
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 75670
11.1%
i 75670
11.1%
a 75670
11.1%
l 75670
11.1%
n 73733
10.8%
d 73733
10.8%
u 73733
10.8%
s 73733
10.8%
t 73733
10.8%
m 3874
 
0.6%
Other values (3) 5811
 
0.9%
Uppercase Letter
ValueCountFrequency (%)
I 73733
97.4%
C 1937
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 756700
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 75670
10.0%
i 75670
10.0%
a 75670
10.0%
l 75670
10.0%
I 73733
9.7%
n 73733
9.7%
d 73733
9.7%
u 73733
9.7%
s 73733
9.7%
t 73733
9.7%
Other values (5) 11622
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 756700
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 75670
10.0%
i 75670
10.0%
a 75670
10.0%
l 75670
10.0%
I 73733
9.7%
n 73733
9.7%
d 73733
9.7%
u 73733
9.7%
s 73733
9.7%
t 73733
9.7%
Other values (5) 11622
 
1.5%

Waste Stream
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     7.1 MiB

Trash           74605
Recycle         1065

Length

Max length      7
Median length   5
Mean length     5.0281485
Min length      5

Characters and Unicode

Total characters      380480
Distinct characters   10
Distinct categories   2
Distinct scripts      1
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
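
The length and character totals for this variable follow directly from its value counts. A minimal sketch, assuming only the two counts reported for Waste Stream (Trash 74605, Recycle 1065); the variable names are illustrative:

```python
# Value counts as reported for Waste Stream.
counts = {"Trash": 74605, "Recycle": 1065}

# Number of non-missing rows.
n_rows = sum(counts.values())

# Each value contributes len(value) characters per occurrence.
total_chars = sum(len(value) * n for value, n in counts.items())

# Mean length is total characters divided by row count.
mean_length = total_chars / n_rows

print(n_rows)       # 75670
print(total_chars)  # 380480
print(mean_length)
```

The same arithmetic reproduces the "Total characters" and "Mean length" figures of the other categorical variables in this report.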

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   Trash
2nd row   Trash
3rd row   Trash
4th row   Trash
5th row   Trash

Common Values

Value     Count   Frequency (%)
Trash     74605   98.6%
Recycle   1065    1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count   Frequency (%)
trash     74605   98.6%
recycle   1065    1.4%

Most occurring characters

ValueCountFrequency (%)
T 74605
19.6%
r 74605
19.6%
a 74605
19.6%
s 74605
19.6%
h 74605
19.6%
e 2130
 
0.6%
c 2130
 
0.6%
R 1065
 
0.3%
y 1065
 
0.3%
l 1065
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 304810
80.1%
Uppercase Letter 75670
 
19.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 74605
24.5%
a 74605
24.5%
s 74605
24.5%
h 74605
24.5%
e 2130
 
0.7%
c 2130
 
0.7%
y 1065
 
0.3%
l 1065
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
T 74605
98.6%
R 1065
 
1.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 380480
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 74605
19.6%
r 74605
19.6%
a 74605
19.6%
s 74605
19.6%
h 74605
19.6%
e 2130
 
0.6%
c 2130
 
0.6%
R 1065
 
0.3%
y 1065
 
0.3%
l 1065
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 380480
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 74605
19.6%
r 74605
19.6%
a 74605
19.6%
s 74605
19.6%
h 74605
19.6%
e 2130
 
0.6%
c 2130
 
0.6%
R 1065
 
0.3%
y 1065
 
0.3%
l 1065
 
0.3%

Waste Type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct        5
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     7.1 MiB

NONE                             70973
CONSTRUCTION/DEMOLITION DEBRIS   3664
SINGLE STREAM RECYCLING          756
WOOD                             266
METAL                            11

Length

Max length      30
Median length   4
Mean length     5.4489097
Min length      4

Characters and Unicode

Total characters      412319
Distinct characters   19
Distinct categories   3
Distinct scripts      2
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   NONE
2nd row   NONE
3rd row   NONE
4th row   NONE
5th row   NONE

Common Values

Value                            Count   Frequency (%)
NONE                             70973   93.8%
CONSTRUCTION/DEMOLITION DEBRIS   3664    4.8%
SINGLE STREAM RECYCLING          756     1.0%
WOOD                             266     0.4%
METAL                            11      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                     Count   Frequency (%)
none                      70973   87.8%
construction/demolition   3664    4.5%
debris                    3664    4.5%
single                    756     0.9%
stream                    756     0.9%
recycling                 756     0.9%
wood                      266     0.3%
metal                     11      < 0.1%

Most occurring characters

ValueCountFrequency (%)
N 154450
37.5%
O 86161
20.9%
E 80580
19.5%
I 16168
 
3.9%
T 11759
 
2.9%
C 8840
 
2.1%
S 8840
 
2.1%
R 8840
 
2.1%
D 7594
 
1.8%
L 5187
 
1.3%
Other values (9) 23900
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 403479
97.9%
Space Separator 5176
 
1.3%
Other Punctuation 3664
 
0.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 154450
38.3%
O 86161
21.4%
E 80580
20.0%
I 16168
 
4.0%
T 11759
 
2.9%
C 8840
 
2.2%
S 8840
 
2.2%
R 8840
 
2.2%
D 7594
 
1.9%
L 5187
 
1.3%
Other values (7) 15060
 
3.7%
Space Separator
ValueCountFrequency (%)
5176
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 3664
100.0%
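
The category names in these tables (Uppercase Letter, Space Separator, Other Punctuation) correspond to the Unicode general categories `Lu`, `Zs` and `Po`. As a rough check, the per-category totals can be rebuilt from the Waste Type value counts with Python's standard `unicodedata` module. This is a sketch under the assumption that every character of every non-missing value is counted once per occurrence; it is not taken from the report's own code.

```python
import unicodedata
from collections import Counter

# Value counts as reported for Waste Type.
counts = {
    "NONE": 70973,
    "CONSTRUCTION/DEMOLITION DEBRIS": 3664,
    "SINGLE STREAM RECYCLING": 756,
    "WOOD": 266,
    "METAL": 11,
}

# Weight each character's general category by how often its value occurs.
category_totals = Counter()
for value, n in counts.items():
    for ch in value:
        category_totals[unicodedata.category(ch)] += n

print(category_totals["Lu"])  # 403479 uppercase letters
print(category_totals["Zs"])  # 5176 spaces
print(category_totals["Po"])  # 3664 slashes
```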

Most occurring scripts

ValueCountFrequency (%)
Latin 403479
97.9%
Common 8840
 
2.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 154450
38.3%
O 86161
21.4%
E 80580
20.0%
I 16168
 
4.0%
T 11759
 
2.9%
C 8840
 
2.2%
S 8840
 
2.2%
R 8840
 
2.2%
D 7594
 
1.9%
L 5187
 
1.3%
Other values (7) 15060
 
3.7%
Common
ValueCountFrequency (%)
5176
58.6%
/ 3664
41.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 412319
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 154450
37.5%
O 86161
20.9%
E 80580
19.5%
I 16168
 
3.9%
T 11759
 
2.9%
C 8840
 
2.1%
S 8840
 
2.1%
R 8840
 
2.1%
D 7594
 
1.8%
L 5187
 
1.3%
Other values (9) 23900
 
5.8%

Distinct        12
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     3.2 MiB
Minimum         2022-01-01 00:00:00
Maximum         2022-12-01 00:00:00
Histogram with fixed size bins (bins=12)

Distinct        392
Distinct (%)    0.5%
Missing         0
Missing (%)     0.0%
Memory size     3.2 MiB
Minimum         2021-08-12 00:00:00
Maximum         2023-01-01 00:00:00
Histogram with fixed size bins (bins=50)

Distinct        3436
Distinct (%)    4.5%
Missing         0
Missing (%)     0.0%
Memory size     7.3 MiB

Length

Max length      11
Median length   8
Mean length     7.5566275
Min length      6

Characters and Unicode

Total characters      571810
Distinct characters   16
Distinct categories   6
Distinct scripts      1
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique          364
Unique (%)      0.5%

Sample

1st row   $77.25
2nd row   $77.25
3rd row   $77.25
4th row   $77.25
5th row   $77.25

Value                 Count   Frequency (%)
0.00                  2814    3.7%
152.77                2208    2.9%
280.68                1608    2.1%
79.57                 1519    2.0%
290.24                1332    1.8%
569.22                1276    1.7%
185.66                1225    1.6%
602.50                1104    1.5%
330.63                1085    1.4%
550.75                957     1.3%
Other values (3365)   60542   80.0%

Most occurring characters

ValueCountFrequency (%)
$ 75670
13.2%
. 75670
13.2%
74967
13.1%
2 45465
8.0%
0 43560
7.6%
1 43151
7.5%
5 35933
6.3%
6 31965
 
5.6%
7 30061
 
5.3%
8 29540
 
5.2%
Other values (6) 85828
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 342637
59.9%
Other Punctuation 77130
 
13.5%
Currency Symbol 75670
 
13.2%
Space Separator 74967
 
13.1%
Open Punctuation 703
 
0.1%
Close Punctuation 703
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 45465
13.3%
0 43560
12.7%
1 43151
12.6%
5 35933
10.5%
6 31965
9.3%
7 30061
8.8%
8 29540
8.6%
3 28989
8.5%
4 27367
8.0%
9 26606
7.8%
Other Punctuation
ValueCountFrequency (%)
. 75670
98.1%
, 1460
 
1.9%
Currency Symbol
ValueCountFrequency (%)
$ 75670
100.0%
Space Separator
ValueCountFrequency (%)
74967
100.0%
Open Punctuation
ValueCountFrequency (%)
( 703
100.0%
Close Punctuation
ValueCountFrequency (%)
) 703
100.0%
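
The character categories above show that this variable stores amounts as text: every value carries a `$`, some carry thousands separators, and 703 values are wrapped in parentheses. A minimal sketch of converting such strings to numbers follows. It assumes the parentheses follow the accounting convention for negative amounts and that the spaces are padding; both are assumptions, not something the report states. The helper name `parse_amount` and every example string other than `$77.25` are hypothetical.

```python
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Convert strings such as '$77.25 ' or '($1,234.50)' to Decimal."""
    cleaned = text.strip()
    # Assumption: parentheses mark a negative amount (accounting style).
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    # Drop the currency symbol, thousands separators and leftover padding.
    cleaned = cleaned.replace("$", "").replace(",", "").strip()
    value = Decimal(cleaned)
    return -value if negative else value


print(parse_amount("$77.25"))       # 77.25
print(parse_amount("($1,234.50)"))  # -1234.50
```

`Decimal` is used instead of `float` so that currency values keep their exact two-decimal representation.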

Most occurring scripts

ValueCountFrequency (%)
Common 571810
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
$ 75670
13.2%
. 75670
13.2%
74967
13.1%
2 45465
8.0%
0 43560
7.6%
1 43151
7.5%
5 35933
6.3%
6 31965
 
5.6%
7 30061
 
5.3%
8 29540
 
5.2%
Other values (6) 85828
15.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 571810
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
$ 75670
13.2%
. 75670
13.2%
74967
13.1%
2 45465
8.0%
0 43560
7.6%
1 43151
7.5%
5 35933
6.3%
6 31965
 
5.6%
7 30061
 
5.3%
8 29540
 
5.2%
Other values (6) 85828
15.0%

Charge Code Desc.
Categorical

HIGH CORRELATION 

Distinct        28
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     7.5 MiB

DISPOSAL CHARGE              26206
HAUL                         24424
FEE (FRANCHISE)              4271
CONTAINER/COMPACTOR RENTAL   4228
STATE TAX                    3890
Other values (23)            12651

Length

Max length      28
Median length   26
Mean length     11.487023
Min length      3

Characters and Unicode

Total characters      869223
Distinct characters   28
Distinct categories   5
Distinct scripts      2
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique          1
Unique (%)      < 0.1%

Sample

1st row   CONTAINER/COMPACTOR RENTAL
2nd row   CONTAINER/COMPACTOR RENTAL
3rd row   CONTAINER/COMPACTOR RENTAL
4th row   CONTAINER/COMPACTOR RENTAL
5th row   CONTAINER/COMPACTOR RENTAL

Common Values

Value                        Count   Frequency (%)
DISPOSAL CHARGE              26206   34.6%
HAUL                         24424   32.3%
FEE (FRANCHISE)              4271    5.6%
CONTAINER/COMPACTOR RENTAL   4228    5.6%
STATE TAX                    3890    5.1%
CITY TAX                     2243    3.0%
SERVICE ATTEMPT              2049    2.7%
COUNTY TAX                   1993    2.6%
MTA TAX                      1804    2.4%
HAUL/MONTHLY STANDARD SVC    992     1.3%
Other values (18)            3570    4.7%

Length

Histogram of lengths of the category

Value                 Count   Frequency (%)
charge                26390   20.4%
disposal              26206   20.2%
haul                  24424   18.8%
tax                   10215   7.9%
fee                   6010    4.6%
franchise             4271    3.3%
container/compactor   4236    3.3%
rental                4228    3.3%
state                 3890    3.0%
city                  2243    1.7%
Other values (31)     17483   13.5%

Most occurring characters

ValueCountFrequency (%)
A 118892
13.7%
S 67993
 
7.8%
E 65547
 
7.5%
L 59634
 
6.9%
H 58053
 
6.7%
53926
 
6.2%
C 52389
 
6.0%
T 49417
 
5.7%
R 49034
 
5.6%
I 44380
 
5.1%
Other values (18) 249958
28.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 797989
91.8%
Space Separator 53926
 
6.2%
Close Punctuation 6010
 
0.7%
Open Punctuation 6010
 
0.7%
Other Punctuation 5288
 
0.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 118892
14.9%
S 67993
 
8.5%
E 65547
 
8.2%
L 59634
 
7.5%
H 58053
 
7.3%
C 52389
 
6.6%
T 49417
 
6.2%
R 49034
 
6.1%
I 44380
 
5.6%
O 43322
 
5.4%
Other values (14) 189328
23.7%
Space Separator
ValueCountFrequency (%)
53926
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6010
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6010
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 5288
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 797989
91.8%
Common 71234
 
8.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 118892
14.9%
S 67993
 
8.5%
E 65547
 
8.2%
L 59634
 
7.5%
H 58053
 
7.3%
C 52389
 
6.6%
T 49417
 
6.2%
R 49034
 
6.1%
I 44380
 
5.6%
O 43322
 
5.4%
Other values (14) 189328
23.7%
Common
ValueCountFrequency (%)
53926
75.7%
) 6010
 
8.4%
( 6010
 
8.4%
/ 5288
 
7.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 869223
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 118892
13.7%
S 67993
 
7.8%
E 65547
 
7.5%
L 59634
 
6.9%
H 58053
 
6.7%
53926
 
6.2%
C 52389
 
6.0%
T 49417
 
5.7%
R 49034
 
5.6%
I 44380
 
5.1%
Other values (18) 249958
28.8%

Distinct        399
Distinct (%)    0.5%
Missing         0
Missing (%)     0.0%
Memory size     3.2 MiB
Minimum         2021-08-12 00:00:00
Maximum         2023-01-01 00:00:00
Histogram with fixed size bins (bins=50)

Distinct        393
Distinct (%)    0.5%
Missing         0
Missing (%)     0.0%
Memory size     3.2 MiB
Minimum         2021-08-12 00:00:00
Maximum         2023-01-31 00:00:00
Histogram with fixed size bins (bins=50)

Quantity
Real number (ℝ)

HIGH CORRELATION 

Distinct        945
Distinct (%)    1.2%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            2.8239264
Minimum         0.01
Maximum         40
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     3.2 MiB

Quantile statistics

Minimum                     0.01
5-th percentile             1
Q1                          1
median                      1
Q3                          3.57
95-th percentile            9.18
Maximum                     40
Range                       39.99
Interquartile range (IQR)   2.57

Descriptive statistics

Standard deviation                3.9464495
Coefficient of variation (CV)     1.3975044
Kurtosis                          37.643446
Mean                              2.8239264
Median Absolute Deviation (MAD)   0
Skewness                          5.0206317
Sum                               213686.51
Variance                          15.574463
Monotonicity                      Not monotonic
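
Several of the Quantity statistics are derived from one another, so they can be cross-checked. The sketch below recomputes the coefficient of variation, variance, interquartile range and range from the reported mean, standard deviation, quartiles and extremes. The variable names are illustrative and the inputs are the rounded figures printed above, so only approximate agreement is expected.

```python
import math

# Figures as reported for Quantity.
mean = 2.8239264
std = 3.9464495
q1, q3 = 1.0, 3.57
minimum, maximum = 0.01, 40.0

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

# Compare against the reported values, allowing for rounding.
print(math.isclose(cv, 1.3975044, rel_tol=1e-6))
print(math.isclose(variance, 15.574463, rel_tol=1e-6))
print(math.isclose(iqr, 2.57, rel_tol=1e-9))
print(math.isclose(value_range, 39.99, rel_tol=1e-9))
```

A coefficient of variation above 1 together with a median of 1 and a MAD of 0 indicates that most rows have a quantity of exactly 1, with a long right tail.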
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
1                    46517   61.5%
4                    419     0.6%
40                   299     0.4%
30                   280     0.4%
4.18                 150     0.2%
2.34                 129     0.2%
9.18                 124     0.2%
3.69                 122     0.2%
3.11                 120     0.2%
2.8                  119     0.2%
Other values (935)   27391   36.2%

Value   Count   Frequency (%)
0.01    41      0.1%
0.04    12      < 0.1%
0.06    1       < 0.1%
0.07    23      < 0.1%
0.09    3       < 0.1%
0.1     21      < 0.1%
0.15    12      < 0.1%
0.16    12      < 0.1%
0.17    3       < 0.1%
0.18    3       < 0.1%

Value   Count   Frequency (%)
40      299     0.4%
30      280     0.4%
21.38   12      < 0.1%
19.18   11      < 0.1%
16.3    36      < 0.1%
15.85   12      < 0.1%
15.77   24      < 0.1%
14.88   12      < 0.1%
14.84   12      < 0.1%
14.42   12      < 0.1%

Unit of Measure
Categorical

HIGH CORRELATION 

Distinct        3
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     6.9 MiB

EA              45689
TN              29390
YD              591

Length

Max length      2
Median length   2
Mean length     2
Min length      2

Characters and Unicode

Total characters      151340
Distinct characters   6
Distinct categories   1
Distinct scripts      1
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   EA
2nd row   EA
3rd row   EA
4th row   EA
5th row   EA

Common Values

Value   Count   Frequency (%)
EA      45689   60.4%
TN      29390   38.8%
YD      591     0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
ea      45689   60.4%
tn      29390   38.8%
yd      591     0.8%

Most occurring characters

ValueCountFrequency (%)
E 45689
30.2%
A 45689
30.2%
T 29390
19.4%
N 29390
19.4%
Y 591
 
0.4%
D 591
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 151340
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 45689
30.2%
A 45689
30.2%
T 29390
19.4%
N 29390
19.4%
Y 591
 
0.4%
D 591
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 151340
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 45689
30.2%
A 45689
30.2%
T 29390
19.4%
N 29390
19.4%
Y 591
 
0.4%
D 591
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 151340
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 45689
30.2%
A 45689
30.2%
T 29390
19.4%
N 29390
19.4%
Y 591
 
0.4%
D 591
 
0.4%

Charge Group
Categorical

HIGH CORRELATION 

Distinct        6
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     7.0 MiB

DSP             26206
HAUL            25416
TAX             10215
MISC            7259
FEE             6010

Length

Max length      4
Median length   3
Mean length     3.4318092
Min length      3

Characters and Unicode

Total characters      259685
Distinct characters   14
Distinct categories   1
Distinct scripts      1
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   MISC
2nd row   MISC
3rd row   MISC
4th row   MISC
5th row   MISC

Common Values

Value   Count   Frequency (%)
DSP     26206   34.6%
HAUL    25416   33.6%
TAX     10215   13.5%
MISC    7259    9.6%
FEE     6010    7.9%
STD     564     0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
dsp     26206   34.6%
haul    25416   33.6%
tax     10215   13.5%
misc    7259    9.6%
fee     6010    7.9%
std     564     0.7%

Most occurring characters

ValueCountFrequency (%)
A 35631
13.7%
S 34029
13.1%
D 26770
10.3%
P 26206
10.1%
H 25416
9.8%
U 25416
9.8%
L 25416
9.8%
E 12020
 
4.6%
T 10779
 
4.2%
X 10215
 
3.9%
Other values (4) 27787
10.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 259685
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 35631
13.7%
S 34029
13.1%
D 26770
10.3%
P 26206
10.1%
H 25416
9.8%
U 25416
9.8%
L 25416
9.8%
E 12020
 
4.6%
T 10779
 
4.2%
X 10215
 
3.9%
Other values (4) 27787
10.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 259685
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 35631
13.7%
S 34029
13.1%
D 26770
10.3%
P 26206
10.1%
H 25416
9.8%
U 25416
9.8%
L 25416
9.8%
E 12020
 
4.6%
T 10779
 
4.2%
X 10215
 
3.9%
Other values (4) 27787
10.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259685
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 35631
13.7%
S 34029
13.1%
D 26770
10.3%
P 26206
10.1%
H 25416
9.8%
U 25416
9.8%
L 25416
9.8%
E 12020
 
4.6%
T 10779
 
4.2%
X 10215
 
3.9%
Other values (4) 27787
10.7%

Charge Sub-Type
Categorical

HIGH CORRELATION  MISSING 

Distinct        19
Distinct (%)    0.1%
Missing         52372
Missing (%)     69.2%
Memory size     5.9 MiB

FRAN                4271
RENT                4228
STAT                3890
CITY                2243
TRIP                2049
Other values (14)   6617

Length

Max length      4
Median length   4
Mean length     3.9587947
Min length      3

Characters and Unicode

Total characters      92232
Distinct characters   21
Distinct categories   1
Distinct scripts      1
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique          1
Unique (%)      < 0.1%

Sample

1st row   RENT
2nd row   RENT
3rd row   RENT
4th row   RENT
5th row   RENT

Common Values

Value              Count   Frequency (%)
FRAN               4271    5.6%
RENT               4228    5.6%
STAT               3890    5.1%
CITY               2243    3.0%
TRIP               2049    2.7%
CTAX               1993    2.6%
MTAX               1804    2.4%
MLFT               841     1.1%
EXSS               501     0.7%
LEGS               454     0.6%
Other values (9)   1024    1.4%
(Missing)          52372   69.2%
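
Because 69.2% of Charge Sub-Type is missing, the percentage reported for a value depends on the denominator: the table above divides by all rows, while the lowercase word table for this variable divides by non-missing rows only. A short sketch of the difference, using the reported counts for FRAN; the variable names are illustrative:

```python
# Counts as reported for Charge Sub-Type.
n_rows = 75670
n_missing = 52372
fran = 4271

n_present = n_rows - n_missing      # rows with a sub-type recorded

missing_share = n_missing / n_rows  # share of rows with no sub-type
share_of_all = fran / n_rows        # FRAN as a share of all rows
share_of_present = fran / n_present # FRAN as a share of non-missing rows

print(n_present)                      # 23298
print(f"{missing_share:.1%}")         # 69.2%
print(f"{share_of_all:.1%}")          # 5.6%
print(f"{share_of_present:.1%}")      # 18.3%
```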

Length

Histogram of lengths of the category

Value              Count   Frequency (%)
fran               4271    18.3%
rent               4228    18.1%
stat               3890    16.7%
city               2243    9.6%
trip               2049    8.8%
ctax               1993    8.6%
mtax               1804    7.7%
mlft               841     3.6%
exss               501     2.2%
legs               454     1.9%
Other values (9)   1024    4.4%

Most occurring characters

ValueCountFrequency (%)
T 21223
23.0%
A 12683
13.8%
R 10611
11.5%
N 8499
9.2%
E 5472
 
5.9%
S 5346
 
5.8%
F 5112
 
5.5%
X 4599
 
5.0%
I 4292
 
4.7%
C 4236
 
4.6%
Other values (11) 10159
11.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 92232
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 21223
23.0%
A 12683
13.8%
R 10611
11.5%
N 8499
9.2%
E 5472
 
5.9%
S 5346
 
5.8%
F 5112
 
5.5%
X 4599
 
5.0%
I 4292
 
4.7%
C 4236
 
4.6%
Other values (11) 10159
11.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 92232
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 21223
23.0%
A 12683
13.8%
R 10611
11.5%
N 8499
9.2%
E 5472
 
5.9%
S 5346
 
5.8%
F 5112
 
5.5%
X 4599
 
5.0%
I 4292
 
4.7%
C 4236
 
4.6%
Other values (11) 10159
11.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 92232
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 21223
23.0%
A 12683
13.8%
R 10611
11.5%
N 8499
9.2%
E 5472
 
5.9%
S 5346
 
5.8%
F 5112
 
5.5%
X 4599
 
5.0%
I 4292
 
4.7%
C 4236
 
4.6%
Other values (11) 10159
11.0%

Report Period
Categorical

HIGH CORRELATION 

Distinct        12
Distinct (%)    < 0.1%
Missing         470
Missing (%)     0.6%
Memory size     7.1 MiB

Jul-22             7177
Jun-22             6809
Aug-22             6503
May-22             6500
Apr-22             6491
Other values (7)   41720

Length

Max length      6
Median length   6
Mean length     6
Min length      6

Characters and Unicode

Total characters      451200
Distinct characters   24
Distinct categories   4
Distinct scripts      2
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   Dec-22
2nd row   Nov-22
3rd row   Oct-22
4th row   Sep-22
5th row   Aug-22
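
Report Period is profiled as a categorical variable because its values are month labels such as `Jul-22` stored as text. If a chronological ordering is needed, the labels can be parsed into dates. A minimal sketch using the standard library; the helper name `parse_period` is hypothetical, and `%b` matches English month abbreviations only under an English or C locale.

```python
from datetime import datetime


def parse_period(label: str) -> datetime:
    """Parse a label like 'Jul-22' into the first day of that month."""
    # %b = abbreviated month name (locale-dependent), %y = two-digit year.
    return datetime.strptime(label, "%b-%y")


# Sample labels taken from the rows shown above.
labels = ["Dec-22", "Nov-22", "Oct-22", "Sep-22", "Aug-22"]
ordered = sorted(labels, key=parse_period)

print(parse_period("Jul-22").date())  # 2022-07-01
print(ordered)  # ['Aug-22', 'Sep-22', 'Oct-22', 'Nov-22', 'Dec-22']
```

Sorting the raw strings would order them alphabetically (Apr, Aug, Dec, ...), which is why parsing first is usually preferable.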

Common Values

Value              Count   Frequency (%)
Jul-22             7177    9.5%
Jun-22             6809    9.0%
Aug-22             6503    8.6%
May-22             6500    8.6%
Apr-22             6491    8.6%
Sep-22             6268    8.3%
Mar-22             6247    8.3%
Feb-22             6056    8.0%
Jan-22             6038    8.0%
Oct-22             5878    7.8%
Other values (2)   11233   14.8%

Length

Histogram of lengths of the category

Value              Count   Frequency (%)
jul-22             7177    9.5%
jun-22             6809    9.1%
aug-22             6503    8.6%
may-22             6500    8.6%
apr-22             6491    8.6%
sep-22             6268    8.3%
mar-22             6247    8.3%
feb-22             6056    8.1%
jan-22             6038    8.0%
oct-22             5878    7.8%
Other values (2)   11233   14.9%

Most occurring characters

ValueCountFrequency (%)
2 150400
33.3%
- 75200
16.7%
u 20489
 
4.5%
J 20024
 
4.4%
a 18785
 
4.2%
e 18120
 
4.0%
A 12994
 
2.9%
n 12847
 
2.8%
p 12759
 
2.8%
M 12747
 
2.8%
Other values (14) 96835
21.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 150400
33.3%
Lowercase Letter 150400
33.3%
Dash Punctuation 75200
16.7%
Uppercase Letter 75200
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 20489
13.6%
a 18785
12.5%
e 18120
12.0%
n 12847
8.5%
p 12759
8.5%
r 12738
8.5%
c 11674
7.8%
l 7177
 
4.8%
g 6503
 
4.3%
y 6500
 
4.3%
Other values (4) 22808
15.2%
Uppercase Letter
ValueCountFrequency (%)
J 20024
26.6%
A 12994
17.3%
M 12747
17.0%
S 6268
 
8.3%
F 6056
 
8.1%
O 5878
 
7.8%
D 5796
 
7.7%
N 5437
 
7.2%
Decimal Number
ValueCountFrequency (%)
2 150400
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 75200
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 225600
50.0%
Latin 225600
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 20489
 
9.1%
J 20024
 
8.9%
a 18785
 
8.3%
e 18120
 
8.0%
A 12994
 
5.8%
n 12847
 
5.7%
p 12759
 
5.7%
M 12747
 
5.7%
r 12738
 
5.6%
c 11674
 
5.2%
Other values (12) 72423
32.1%
Common
ValueCountFrequency (%)
2 150400
66.7%
- 75200
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 451200
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 150400
33.3%
- 75200
16.7%
u 20489
 
4.5%
J 20024
 
4.4%
a 18785
 
4.2%
e 18120
 
4.0%
A 12994
 
2.9%
n 12847
 
2.8%
p 12759
 
2.8%
M 12747
 
2.8%
Other values (14) 96835
21.5%

Service Period
Categorical

HIGH CORRELATION 

Distinct        12
Distinct (%)    < 0.1%
Missing         470
Missing (%)     0.6%
Memory size     7.1 MiB

Jul-22             7177
Jun-22             6809
Aug-22             6503
May-22             6500
Apr-22             6491
Other values (7)   41720

Length

Max length      6
Median length   6
Mean length     6
Min length      6

Characters and Unicode

Total characters      451200
Distinct characters   24
Distinct categories   4
Distinct scripts      2
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   Dec-22
2nd row   Nov-22
3rd row   Oct-22
4th row   Sep-22
5th row   Aug-22

Common Values

Value              Count   Frequency (%)
Jul-22             7177    9.5%
Jun-22             6809    9.0%
Aug-22             6503    8.6%
May-22             6500    8.6%
Apr-22             6491    8.6%
Sep-22             6268    8.3%
Mar-22             6247    8.3%
Feb-22             6056    8.0%
Jan-22             6038    8.0%
Oct-22             5878    7.8%
Other values (2)   11233   14.8%

Length

Histogram of lengths of the category

Value              Count   Frequency (%)
jul-22             7177    9.5%
jun-22             6809    9.1%
aug-22             6503    8.6%
may-22             6500    8.6%
apr-22             6491    8.6%
sep-22             6268    8.3%
mar-22             6247    8.3%
feb-22             6056    8.1%
jan-22             6038    8.0%
oct-22             5878    7.8%
Other values (2)   11233   14.9%

Most occurring characters

ValueCountFrequency (%)
2 150400
33.3%
- 75200
16.7%
u 20489
 
4.5%
J 20024
 
4.4%
a 18785
 
4.2%
e 18120
 
4.0%
A 12994
 
2.9%
n 12847
 
2.8%
p 12759
 
2.8%
M 12747
 
2.8%
Other values (14) 96835
21.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 150400
33.3%
Lowercase Letter 150400
33.3%
Dash Punctuation 75200
16.7%
Uppercase Letter 75200
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 20489
13.6%
a 18785
12.5%
e 18120
12.0%
n 12847
8.5%
p 12759
8.5%
r 12738
8.5%
c 11674
7.8%
l 7177
 
4.8%
g 6503
 
4.3%
y 6500
 
4.3%
Other values (4) 22808
15.2%
Uppercase Letter
ValueCountFrequency (%)
J 20024
26.6%
A 12994
17.3%
M 12747
17.0%
S 6268
 
8.3%
F 6056
 
8.1%
O 5878
 
7.8%
D 5796
 
7.7%
N 5437
 
7.2%
Decimal Number
ValueCountFrequency (%)
2 150400
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 75200
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 225600
50.0%
Latin 225600
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 20489
 
9.1%
J 20024
 
8.9%
a 18785
 
8.3%
e 18120
 
8.0%
A 12994
 
5.8%
n 12847
 
5.7%
p 12759
 
5.7%
M 12747
 
5.7%
r 12738
 
5.6%
c 11674
 
5.2%
Other values (12) 72423
32.1%
Common
ValueCountFrequency (%)
2 150400
66.7%
- 75200
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 451200
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 150400
33.3%
- 75200
16.7%
u 20489
 
4.5%
J 20024
 
4.4%
a 18785
 
4.2%
e 18120
 
4.0%
A 12994
 
2.9%
n 12847
 
2.8%
p 12759
 
2.8%
M 12747
 
2.8%
Other values (14) 96835
21.5%

Original Volume
Real number (ℝ)

HIGH CORRELATION 

Distinct        570
Distinct (%)    0.8%
Missing         470
Missing (%)     0.6%
Infinite        0
Infinite (%)    0.0%
Mean            33.99846
Minimum         0
Maximum         270
Zeros           320
Zeros (%)       0.4%
Negative        0
Negative (%)    0.0%
Memory size     3.2 MiB

Quantile statistics

Minimum                     0
5-th percentile             4.21
Q1                          16.46
median                      24.57
Q3                          36.56
95-th percentile            97.33
Maximum                     270
Range                       270
Interquartile range (IQR)   20.1

Descriptive statistics

Standard deviation                31.150554
Coefficient of variation (CV)     0.91623426
Kurtosis                          9.6204505
Mean                              33.99846
Median Absolute Deviation (MAD)   9.32
Skewness                          2.5208257
Sum                               2556684.2
Variance                          970.35701
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
34.64                756     1.0%
29.73                512     0.7%
3.81                 466     0.6%
14                   466     0.6%
105.32               466     0.6%
118.73               466     0.6%
92.56                466     0.6%
112.69               466     0.6%
89.82                466     0.6%
110.76               466     0.6%
Other values (560)   70204   92.8%
(Missing)            470     0.6%

Value   Count   Frequency (%)
0       320     0.4%
0.31    83      0.1%
0.33    5       < 0.1%
0.41    9       < 0.1%
0.47    7       < 0.1%
0.57    8       < 0.1%
0.58    11      < 0.1%
0.63    5       < 0.1%
0.64    67      0.1%
0.86    16      < 0.1%

Value    Count   Frequency (%)
270      115     0.2%
220      115     0.2%
200      115     0.2%
190      92      0.1%
180      115     0.2%
140      115     0.2%
136      330     0.4%
120      115     0.2%
119.65   466     0.6%
118.73   466     0.6%

UOM for Volume
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct        3
Distinct (%)    < 0.1%
Missing         470
Missing (%)     0.6%
Memory size     6.8 MiB

TN               71412
YD               3198
Other Quantity   590

Length

Max length      14
Median length   2
Mean length     2.0941489
Min length      2

Characters and Unicode

Total characters      157480
Distinct characters   16
Distinct categories   3
Distinct scripts      2
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   TN
2nd row   TN
3rd row   TN
4th row   TN
5th row   TN

Common Values

Value            Count   Frequency (%)
TN               71412   94.4%
YD               3198    4.2%
Other Quantity   590     0.8%
(Missing)        470     0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value      Count   Frequency (%)
tn         71412   94.2%
yd         3198    4.2%
other      590     0.8%
quantity   590     0.8%

Most occurring characters

ValueCountFrequency (%)
T 71412
45.3%
N 71412
45.3%
Y 3198
 
2.0%
D 3198
 
2.0%
t 1770
 
1.1%
O 590
 
0.4%
h 590
 
0.4%
e 590
 
0.4%
r 590
 
0.4%
590
 
0.4%
Other values (6) 3540
 
2.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 150400
95.5%
Lowercase Letter 6490
 
4.1%
Space Separator 590
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1770
27.3%
h 590
 
9.1%
e 590
 
9.1%
r 590
 
9.1%
u 590
 
9.1%
a 590
 
9.1%
n 590
 
9.1%
i 590
 
9.1%
y 590
 
9.1%
Uppercase Letter
ValueCountFrequency (%)
T 71412
47.5%
N 71412
47.5%
Y 3198
 
2.1%
D 3198
 
2.1%
O 590
 
0.4%
Q 590
 
0.4%
Space Separator
ValueCountFrequency (%)
590
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 156890
99.6%
Common 590
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 71412
45.5%
N 71412
45.5%
Y 3198
 
2.0%
D 3198
 
2.0%
t 1770
 
1.1%
O 590
 
0.4%
h 590
 
0.4%
e 590
 
0.4%
r 590
 
0.4%
Q 590
 
0.4%
Other values (5) 2950
 
1.9%
Common
ValueCountFrequency (%)
590
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 157480
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 71412
45.3%
N 71412
45.3%
Y 3198
 
2.0%
D 3198
 
2.0%
t 1770
 
1.1%
O 590
 
0.4%
h 590
 
0.4%
e 590
 
0.4%
r 590
 
0.4%
590
 
0.4%
Other values (6) 3540
 
2.2%

Pounds per Yard
Categorical

HIGH CORRELATION 

Distinct        3
Distinct (%)    < 0.1%
Missing         470
Missing (%)     0.6%
Memory size     7.1 MiB

300.0           46399
100.0           27830
50.0            971

Length

Max length      5
Median length   5
Mean length     4.9870878
Min length      4

Characters and Unicode

Total characters      375029
Distinct characters   5
Distinct categories   2
Distinct scripts      1
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   100.0
2nd row   100.0
3rd row   100.0
4th row   100.0
5th row   100.0

Common Values

Value       Count   Frequency (%)
300.0       46399   61.3%
100.0       27830   36.8%
50.0        971     1.3%
(Missing)   470     0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
300.0   46399   61.7%
100.0   27830   37.0%
50.0    971     1.3%

Most occurring characters

ValueCountFrequency (%)
0 224629
59.9%
. 75200
 
20.1%
3 46399
 
12.4%
1 27830
 
7.4%
5 971
 
0.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 299829 | 79.9%
Other Punctuation | 75200 | 20.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 224629 | 74.9%
3 | 46399 | 15.5%
1 | 27830 | 9.3%
5 | 971 | 0.3%
Other Punctuation
Value | Count | Frequency (%)
. | 75200 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 375029 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 224629 | 59.9%
. | 75200 | 20.1%
3 | 46399 | 12.4%
1 | 27830 | 7.4%
5 | 971 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 375029 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 224629 | 59.9%
. | 75200 | 20.1%
3 | 46399 | 12.4%
1 | 27830 | 7.4%
5 | 971 | 0.3%

Trash Tons
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct | 562
Distinct (%) | 0.7%
Missing | 470
Missing (%) | 0.6%
Infinite | 0
Infinite (%) | 0.0%
Mean | 30.944807
Minimum | 0
Maximum | 136
Zeros | 1772
Zeros (%) | 2.3%
Negative | 0
Negative (%) | 0.0%
Memory size | 3.2 MiB

Quantile statistics

Minimum | 0
5-th percentile | 2.66
Q1 | 15.14
Median | 23.83
Q3 | 34.07
95-th percentile | 92.56
Maximum | 136
Range | 136
Interquartile range (IQR) | 18.93

Descriptive statistics

Standard deviation | 26.662949
Coefficient of variation (CV) | 0.86162919
Kurtosis | 2.5196267
Mean | 30.944807
Median Absolute Deviation (MAD) | 9.37
Skewness | 1.6611463
Sum | 2327049.5
Variance | 710.91283
Monotonicity | Not monotonic
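
Several of the Trash Tons figures above are simple functions of one another. The following sketch recomputes three of them from the reported mean, standard deviation, and quartiles using the usual textbook definitions. The variable names are chosen for this example, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the Trash Tons statistics above.
mean = 30.944807
std = 26.662949
q1, q3 = 15.14, 34.07

cv = std / mean        # coefficient of variation: spread relative to the mean
iqr = q3 - q1          # interquartile range: width of the middle 50%
variance = std ** 2    # variance is the squared standard deviation

print(f"CV={cv:.6f}  IQR={iqr:.2f}  variance={variance:.3f}")
```

A CV below 1 together with a mean (30.94) well above the median (23.83) is consistent with the positive skewness reported for this variable.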
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 1772 | 2.3%
29.73 | 512 | 0.7%
73.44 | 466 | 0.6%
3.81 | 466 | 0.6%
94.83 | 466 | 0.6%
105.32 | 466 | 0.6%
118.73 | 466 | 0.6%
92.56 | 466 | 0.6%
112.69 | 466 | 0.6%
89.82 | 466 | 0.6%
Other values (552) | 69188 | 91.4%
(Missing) | 470 | 0.6%
Minimum 10 values
Value | Count | Frequency (%)
0 | 1772 | 2.3%
0.31 | 83 | 0.1%
0.33 | 5 | < 0.1%
0.41 | 9 | < 0.1%
0.47 | 7 | < 0.1%
0.57 | 8 | < 0.1%
0.58 | 11 | < 0.1%
0.63 | 5 | < 0.1%
0.64 | 67 | 0.1%
0.8 | 235 | 0.3%
Maximum 10 values
Value | Count | Frequency (%)
136 | 330 | 0.4%
119.65 | 466 | 0.6%
118.73 | 466 | 0.6%
112.69 | 466 | 0.6%
110.76 | 466 | 0.6%
105.32 | 466 | 0.6%
97.33 | 456 | 0.6%
94.83 | 466 | 0.6%
92.56 | 466 | 0.6%
89.82 | 466 | 0.6%

Recycle Tons
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct | 6
Distinct (%) | < 0.1%
Missing | 470
Missing (%) | 0.6%
Infinite | 0
Infinite (%) | 0.0%
Mean | 0.011180984
Minimum | 0
Maximum | 5.4
Zeros | 74338
Zeros (%) | 98.2%
Negative | 0
Negative (%) | 0.0%
Memory size | 3.2 MiB

Quantile statistics

Minimum | 0
5-th percentile | 0
Q1 | 0
Median | 0
Q3 | 0
95-th percentile | 0
Maximum | 5.4
Range | 5.4
Interquartile range (IQR) | 0

Descriptive statistics

Standard deviation | 0.1346087
Coefficient of variation (CV) | 12.039074
Kurtosis | 697.17798
Mean | 0.011180984
Median Absolute Deviation (MAD) | 0
Skewness | 22.471008
Sum | 840.81
Variance | 0.018119502
Monotonicity | Not monotonic
Histogram with fixed size bins (bins=6)
Common values
Value | Count | Frequency (%)
0 | 74338 | 98.2%
0.87 | 756 | 1.0%
0.11 | 67 | 0.1%
4.2 | 16 | < 0.1%
4.42 | 16 | < 0.1%
5.4 | 7 | < 0.1%
(Missing) | 470 | 0.6%
Minimum 10 values
Value | Count | Frequency (%)
0 | 74338 | 98.2%
0.11 | 67 | 0.1%
0.87 | 756 | 1.0%
4.2 | 16 | < 0.1%
4.42 | 16 | < 0.1%
5.4 | 7 | < 0.1%
Maximum 10 values
Value | Count | Frequency (%)
5.4 | 7 | < 0.1%
4.42 | 16 | < 0.1%
4.2 | 16 | < 0.1%
0.87 | 756 | 1.0%
0.11 | 67 | 0.1%
0 | 74338 | 98.2%

Tons Derived
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct | 2
Distinct (%) | < 0.1%
Missing | 470
Missing (%) | 0.6%
Memory size | 2.7 MiB
Value | Count | Frequency (%)
False | 72002 | 95.2%
True | 3198 | 4.2%
(Missing) | 470 | 0.6%

Interactions

[Figure: 49 pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | Address | Asset Category | Asset Type | Charge Code Desc. | Charge Group | Charge Sub-Type | City | Container Size | Container Type | Invoice Number | Location ID | Original Volume | Pounds per Yard | Quantity | Recycle Tons | Report Period | Service Period | State | Tons Derived | Trash Tons | UOM for Volume | Unit of Measure | Waste Stream | Waste Type | Zip Code
Address | 1.000 | 0.404 | 0.624 | 0.363 | 0.302 | 0.517 | 0.966 | 0.080 | 0.464 | 0.001 | 0.985 | 0.131 | 0.584 | -0.043 | 0.084 | 0.076 | 0.076 | 1.000 | 0.421 | 0.099 | 0.354 | 0.283 | 0.434 | 0.443 | -0.025
Asset Category | 0.404 | 1.000 | 0.086 | 0.565 | 0.475 | 0.362 | 0.404 | 0.485 | 1.000 | -0.019 | 0.403 | 0.016 | 0.684 | 0.114 | -0.631 | 0.000 | 0.000 | 0.155 | 0.771 | 0.268 | 0.771 | 0.131 | 0.706 | 0.722 | -0.054
Asset Type | 0.624 | 0.086 | 1.000 | 0.200 | 0.142 | 0.291 | 0.596 | -0.170 | 0.127 | 0.011 | 0.619 | -0.144 | 0.748 | -0.001 | -0.051 | 0.065 | 0.065 | 0.325 | 0.060 | -0.194 | 0.078 | 0.150 | 0.057 | 0.394 | -0.141
Charge Code Desc. | 0.363 | 0.565 | 0.200 | 1.000 | 1.000 | 1.000 | 0.360 | -0.012 | 0.392 | -0.025 | 0.323 | 0.003 | 0.386 | -0.416 | 0.058 | 0.009 | 0.009 | 0.306 | 0.433 | -0.028 | 0.313 | 0.667 | 0.468 | 0.293 | -0.001
Charge Group | 0.302 | 0.475 | 0.142 | 1.000 | 1.000 | 1.000 | 0.299 | -0.077 | 0.257 | 0.013 | 0.301 | -0.155 | 0.204 | -0.575 | 0.105 | 0.011 | 0.011 | 0.181 | 0.358 | -0.194 | 0.256 | 0.653 | 0.216 | 0.149 | 0.127
Charge Sub-Type | 0.517 | 0.362 | 0.291 | 1.000 | 1.000 | 1.000 | 0.514 | 0.047 | 0.261 | -0.026 | 0.458 | -0.111 | 0.349 | -0.038 | -0.026 | 0.000 | 0.000 | 0.345 | 0.320 | -0.107 | 0.264 | 0.415 | 0.360 | 0.265 | -0.042
City | 0.966 | 0.404 | 0.596 | 0.360 | 0.299 | 0.514 | 1.000 | 0.066 | 0.464 | 0.013 | 0.949 | -0.030 | 0.571 | 0.043 | -0.120 | 0.071 | 0.071 | 1.000 | 0.417 | -0.005 | 0.352 | 0.277 | 0.434 | 0.440 | -0.285
Container Size | 0.080 | 0.485 | -0.170 | -0.012 | -0.077 | 0.047 | 0.066 | 1.000 | 0.577 | -0.063 | 0.467 | 0.242 | 0.613 | 0.106 | -0.308 | 0.051 | 0.051 | 0.380 | 0.771 | 0.355 | 0.546 | 0.099 | 0.781 | 0.522 | -0.117
Container Type | 0.464 | 1.000 | 0.127 | 0.392 | 0.257 | 0.261 | 0.464 | 0.577 | 1.000 | -0.013 | 0.474 | 0.001 | 0.668 | 0.098 | -0.461 | 0.018 | 0.018 | 0.135 | 0.771 | 0.193 | 0.545 | 0.096 | 0.940 | 0.716 | -0.059
Invoice Number | 0.001 | -0.019 | 0.011 | -0.025 | 0.013 | -0.026 | 0.013 | -0.063 | -0.013 | 1.000 | 0.149 | -0.025 | 0.073 | 0.016 | 0.009 | 0.035 | 0.035 | 0.074 | 0.016 | -0.015 | 0.066 | 0.069 | 0.041 | 0.066 | 0.001
Location ID | 0.985 | 0.403 | 0.619 | 0.323 | 0.301 | 0.458 | 0.949 | 0.467 | 0.474 | 0.149 | 1.000 | -0.011 | 0.581 | 0.014 | 0.087 | 0.062 | 0.062 | 1.000 | 0.420 | -0.091 | 0.354 | 0.280 | 0.434 | 0.438 | 0.479
Original Volume | 0.131 | 0.016 | -0.144 | 0.003 | -0.155 | -0.111 | -0.030 | 0.242 | 0.001 | -0.025 | -0.011 | 1.000 | 0.257 | 0.068 | 0.054 | 0.163 | 0.163 | 0.240 | 0.423 | 0.880 | 0.304 | 0.205 | 0.135 | 0.223 | -0.300
Pounds per Yard | 0.584 | 0.684 | 0.748 | 0.386 | 0.204 | 0.349 | 0.571 | 0.613 | 0.668 | 0.073 | 0.581 | 0.257 | 1.000 | 0.065 | -0.211 | 0.035 | 0.035 | 0.238 | 0.550 | 0.258 | 0.391 | 0.102 | 0.956 | 0.703 | -0.087
Quantity | -0.043 | 0.114 | -0.001 | -0.416 | -0.575 | -0.038 | 0.043 | 0.106 | 0.098 | 0.016 | 0.014 | 0.068 | 0.065 | 1.000 | -0.074 | 0.021 | 0.021 | 0.152 | 0.213 | 0.083 | 0.152 | 0.840 | 0.063 | 0.190 | -0.010
Recycle Tons | 0.084 | -0.631 | -0.051 | 0.058 | 0.105 | -0.026 | -0.120 | -0.308 | -0.461 | 0.009 | 0.087 | 0.054 | -0.211 | -0.074 | 1.000 | 0.036 | 0.036 | 0.082 | 0.478 | -0.182 | 0.338 | 0.058 | 0.864 | 0.707 | 0.071
Report Period | 0.076 | 0.000 | 0.065 | 0.009 | 0.011 | 0.000 | 0.071 | 0.051 | 0.018 | 0.035 | 0.062 | 0.163 | 0.035 | 0.021 | 0.036 | 1.000 | 1.000 | 0.040 | 0.059 | 0.047 | 0.161 | 0.042 | 0.000 | 0.053 | 0.012
Service Period | 0.076 | 0.000 | 0.065 | 0.009 | 0.011 | 0.000 | 0.071 | 0.051 | 0.018 | 0.035 | 0.062 | 0.163 | 0.035 | 0.021 | 0.036 | 1.000 | 1.000 | 0.040 | 0.059 | 0.047 | 0.161 | 0.042 | 0.000 | 0.053 | 0.012
State | 1.000 | 0.155 | 0.325 | 0.306 | 0.181 | 0.345 | 1.000 | 0.380 | 0.135 | 0.074 | 1.000 | 0.240 | 0.238 | 0.152 | 0.082 | 0.040 | 0.040 | 1.000 | 0.303 | -0.109 | 0.238 | 0.256 | 0.121 | 0.351 | -0.080
Tons Derived | 0.421 | 0.771 | 0.060 | 0.433 | 0.358 | 0.320 | 0.417 | 0.771 | 0.771 | 0.016 | 0.420 | 0.423 | 0.550 | 0.213 | 0.478 | 0.059 | 0.059 | 0.303 | 1.000 | -0.317 | 1.000 | 0.230 | 0.546 | 0.641 | 0.030
Trash Tons | 0.099 | 0.268 | -0.194 | -0.028 | -0.194 | -0.107 | -0.005 | 0.355 | 0.193 | -0.015 | -0.091 | 0.880 | 0.258 | 0.083 | -0.182 | 0.047 | 0.047 | -0.109 | -0.317 | 1.000 | 0.300 | 0.142 | 0.220 | 0.213 | -0.297
UOM for Volume | 0.354 | 0.771 | 0.078 | 0.313 | 0.256 | 0.264 | 0.352 | 0.546 | 0.545 | 0.066 | 0.354 | 0.304 | 0.391 | 0.152 | 0.338 | 0.161 | 0.161 | 0.238 | 1.000 | 0.300 | 1.000 | 0.163 | 0.546 | 0.453 | 0.041
Unit of Measure | 0.283 | 0.131 | 0.150 | 0.667 | 0.653 | 0.415 | 0.277 | 0.099 | 0.096 | 0.069 | 0.280 | 0.205 | 0.102 | 0.840 | 0.058 | 0.042 | 0.042 | 0.256 | 0.230 | 0.142 | 0.163 | 1.000 | 0.094 | 0.273 | -0.000
Waste Stream | 0.434 | 0.706 | 0.057 | 0.468 | 0.216 | 0.360 | 0.434 | 0.781 | 0.940 | 0.041 | 0.434 | 0.135 | 0.956 | 0.063 | 0.864 | 0.000 | 0.000 | 0.121 | 0.546 | 0.220 | 0.546 | 0.094 | 1.000 | 0.985 | -0.044
Waste Type | 0.443 | 0.722 | 0.394 | 0.293 | 0.149 | 0.265 | 0.440 | 0.522 | 0.716 | 0.066 | 0.438 | 0.223 | 0.703 | 0.190 | 0.707 | 0.053 | 0.053 | 0.351 | 0.641 | 0.213 | 0.453 | 0.273 | 0.985 | 1.000 | 0.054
Zip Code | -0.025 | -0.054 | -0.141 | -0.001 | 0.127 | -0.042 | -0.285 | -0.117 | -0.059 | 0.001 | 0.479 | -0.300 | -0.087 | -0.010 | 0.071 | 0.012 | 0.012 | -0.080 | 0.030 | -0.297 | 0.041 | -0.000 | -0.044 | 0.054 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
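
One way to make the idea of nullity correlation concrete is to turn each column into a 0/1 "is missing" indicator and compute the Pearson correlation between indicators. The sketch below does this in plain Python. The helper names and the toy columns are assumptions for illustration (column names are borrowed from the report, values are invented), and the report's own heatmap may be computed differently.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    # Undefined (division by zero) if a column is never or always missing.
    return cov / sqrt(vx * vy)


def nullity_correlation(col_a, col_b):
    """Correlate the missingness indicators (1 = missing) of two columns."""
    a = [1 if v is None else 0 for v in col_a]
    b = [1 if v is None else 0 for v in col_b]
    return pearson(a, b)


# Hypothetical toy columns; None marks a missing cell.
trash_tons = [17.16, None, 20.21, None, 13.16]
recycle_tons = [0.0, None, 0.0, None, 0.0]
charge_sub_type = ["RENT", "RENT", None, "STAT", None]

same = nullity_correlation(trash_tons, recycle_tons)
opposite = nullity_correlation(trash_tons, charge_sub_type)
print(same, opposite)
```

A value near 1 means two columns tend to be missing in the same rows, as would be expected here for the tonnage columns that each report 470 missing values; a negative value means one tends to be present where the other is absent. With pandas, `df.isna().corr()` gives the analogous matrix for a whole frame.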

Sample

Index | Location ID | Invoice Number | Account | Asset ID | Location | Address | City | State | Zip Code | Container Type | Container Size | Compactor | Asset Type | Asset Category | Waste Stream | Waste Type | Report Date | Charge Date | Charge Amount | Charge Code Desc. | From Date | To Date | Quantity | Unit of Measure | Charge Group | Charge Sub-Type | Report Period | Service Period | Original Volume | UOM for Volume | Pounds per Yard | Trash Tons | Recycle Tons | Tons Derived
0 | 101 | 20221200 | 800-0303937 | 800-0303937-00001-01 | FLOOR & DECOR 101 30YD CLEANUP | 1690 NORTHEAST EXPY NE | BROOKHAVEN | GA | 30329 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | NONE | 12/1/2022 | 12/31/2022 | $77.25 | CONTAINER/COMPACTOR RENTAL | 1/1/2023 | 1/31/2023 | 1.00 | EA | MISC | RENT | Dec-22 | Dec-22 | 17.16 | TN | 100.0 | 17.16 | 0.0 | N
1 | 101 | 20221200 | 800-0303937 | 800-0303937-00001-01 | FLOOR & DECOR 101 30YD CLEANUP | 1690 NORTHEAST EXPY NE | BROOKHAVEN | GA | 30329 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | NONE | 12/1/2022 | 12/31/2022 | $77.25 | CONTAINER/COMPACTOR RENTAL | 1/1/2023 | 1/31/2023 | 1.00 | EA | MISC | RENT | Nov-22 | Nov-22 | 15.67 | TN | 100.0 | 15.67 | 0.0 | N
2 | 101 | 20221200 | 800-0303937 | 800-0303937-00001-01 | FLOOR & DECOR 101 30YD CLEANUP | 1690 NORTHEAST EXPY NE | BROOKHAVEN | GA | 30329 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | NONE | 12/1/2022 | 12/31/2022 | $77.25 | CONTAINER/COMPACTOR RENTAL | 1/1/2023 | 1/31/2023 | 1.00 | EA | MISC | RENT | Oct-22 | Oct-22 | 20.21 | TN | 100.0 | 20.21 | 0.0 | N
3 | 101 | 20221200 | 800-0303937 | 800-0303937-00001-01 | FLOOR & DECOR 101 30YD CLEANUP | 1690 NORTHEAST EXPY NE | BROOKHAVEN | GA | 30329 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | NONE | 12/1/2022 | 12/31/2022 | $77.25 | CONTAINER/COMPACTOR RENTAL | 1/1/2023 | 1/31/2023 | 1.00 | EA | MISC | RENT | Sep-22 | Sep-22 | 14.61 | TN | 100.0 | 14.61 | 0.0 | N
4 | 101 | 20221200 | 800-0303937 | 800-0303937-00001-01 | FLOOR & DECOR 101 30YD CLEANUP | 1690 NORTHEAST EXPY NE | BROOKHAVEN | GA | 30329 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | NONE | 12/1/2022 | 12/31/2022 | $77.25 | CONTAINER/COMPACTOR RENTAL | 1/1/2023 | 1/31/2023 | 1.00 | EA | MISC | RENT | Aug-22 | Aug-22 | 13.16 | TN | 100.0 | 13.16 | 0.0 | N
5 | 101 | 20221200 | 800-0303937 | 800-0303937-00001-01 | FLOOR & DECOR 101 30YD CLEANUP | 1690 NORTHEAST EXPY NE | BROOKHAVEN | GA | 30329 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | NONE | 12/1/2022 | 12/1/2022 | $95.86 | DISPOSAL CHARGE | 12/1/2022 | 12/1/2022 | 1.77 | TN | DSP | NaN | Dec-22 | Dec-22 | 17.16 | TN | 100.0 | 17.16 | 0.0 | N
6 | 101 | 20221200 | 800-0303937 | 800-0303937-00001-01 | FLOOR & DECOR 101 30YD CLEANUP | 1690 NORTHEAST EXPY NE | BROOKHAVEN | GA | 30329 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | NONE | 12/1/2022 | 12/1/2022 | $95.86 | DISPOSAL CHARGE | 12/1/2022 | 12/1/2022 | 1.77 | TN | DSP | NaN | Nov-22 | Nov-22 | 15.67 | TN | 100.0 | 15.67 | 0.0 | N
7 | 101 | 20221200 | 800-0303937 | 800-0303937-00001-01 | FLOOR & DECOR 101 30YD CLEANUP | 1690 NORTHEAST EXPY NE | BROOKHAVEN | GA | 30329 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | NONE | 12/1/2022 | 12/1/2022 | $95.86 | DISPOSAL CHARGE | 12/1/2022 | 12/1/2022 | 1.77 | TN | DSP | NaN | Oct-22 | Oct-22 | 20.21 | TN | 100.0 | 20.21 | 0.0 | N
8 | 101 | 20221200 | 800-0303937 | 800-0303937-00001-01 | FLOOR & DECOR 101 30YD CLEANUP | 1690 NORTHEAST EXPY NE | BROOKHAVEN | GA | 30329 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | NONE | 12/1/2022 | 12/1/2022 | $95.86 | DISPOSAL CHARGE | 12/1/2022 | 12/1/2022 | 1.77 | TN | DSP | NaN | Sep-22 | Sep-22 | 14.61 | TN | 100.0 | 14.61 | 0.0 | N
9 | 101 | 20221200 | 800-0303937 | 800-0303937-00001-01 | FLOOR & DECOR 101 30YD CLEANUP | 1690 NORTHEAST EXPY NE | BROOKHAVEN | GA | 30329 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | NONE | 12/1/2022 | 12/1/2022 | $95.86 | DISPOSAL CHARGE | 12/1/2022 | 12/1/2022 | 1.77 | TN | DSP | NaN | Aug-22 | Aug-22 | 13.16 | TN | 100.0 | 13.16 | 0.0 | N
Index | Location ID | Invoice Number | Account | Asset ID | Location | Address | City | State | Zip Code | Container Type | Container Size | Compactor | Asset Type | Asset Category | Waste Stream | Waste Type | Report Date | Charge Date | Charge Amount | Charge Code Desc. | From Date | To Date | Quantity | Unit of Measure | Charge Group | Charge Sub-Type | Report Period | Service Period | Original Volume | UOM for Volume | Pounds per Yard | Trash Tons | Recycle Tons | Tons Derived
75660 | DC 991 | 20220100 | 853-0136012 | 853-0136012-00001-01 | FLOOR AND DECOR DC 991 40YD | 5120 CEDAR PORT PKWY | BAYTOWN | TX | 77523 | RO | 40 | NaN | PERMANENT | Industrial | Trash | NONE | 1/1/2022 | 1/4/2022 | $435.72 | STATE TAX | 1/4/2022 | 1/4/2022 | 2.81 | TN | TAX | STAT | Sep-22 | Sep-22 | 28.84 | TN | 100.0 | 28.84 | 0.0 | N
75661 | DC 991 | 20220100 | 853-0136012 | 853-0136012-00001-01 | FLOOR AND DECOR DC 991 40YD | 5120 CEDAR PORT PKWY | BAYTOWN | TX | 77523 | RO | 40 | NaN | PERMANENT | Industrial | Trash | NONE | 1/1/2022 | 1/4/2022 | $435.72 | STATE TAX | 1/4/2022 | 1/4/2022 | 2.81 | TN | TAX | STAT | Aug-22 | Aug-22 | 52.83 | TN | 100.0 | 52.83 | 0.0 | N
75662 | DC 991 | 20220100 | 853-0136012 | 853-0136012-00001-01 | FLOOR AND DECOR DC 991 40YD | 5120 CEDAR PORT PKWY | BAYTOWN | TX | 77523 | RO | 40 | NaN | PERMANENT | Industrial | Trash | NONE | 1/1/2022 | 1/4/2022 | $435.72 | STATE TAX | 1/4/2022 | 1/4/2022 | 2.81 | TN | TAX | STAT | Jul-22 | Jul-22 | 58.58 | TN | 100.0 | 58.58 | 0.0 | N
75663 | DC 991 | 20220100 | 853-0136012 | 853-0136012-00001-01 | FLOOR AND DECOR DC 991 40YD | 5120 CEDAR PORT PKWY | BAYTOWN | TX | 77523 | RO | 40 | NaN | PERMANENT | Industrial | Trash | NONE | 1/1/2022 | 1/4/2022 | $435.72 | STATE TAX | 1/4/2022 | 1/4/2022 | 2.81 | TN | TAX | STAT | Jun-22 | Jun-22 | 56.12 | TN | 100.0 | 56.12 | 0.0 | N
75664 | DC 991 | 20220100 | 853-0136012 | 853-0136012-00001-01 | FLOOR AND DECOR DC 991 40YD | 5120 CEDAR PORT PKWY | BAYTOWN | TX | 77523 | RO | 40 | NaN | PERMANENT | Industrial | Trash | NONE | 1/1/2022 | 1/4/2022 | $435.72 | STATE TAX | 1/4/2022 | 1/4/2022 | 2.81 | TN | TAX | STAT | May-22 | May-22 | 70.07 | TN | 100.0 | 70.07 | 0.0 | N
75665 | DC 991 | 20220100 | 853-0136012 | 853-0136012-00001-01 | FLOOR AND DECOR DC 991 40YD | 5120 CEDAR PORT PKWY | BAYTOWN | TX | 77523 | RO | 40 | NaN | PERMANENT | Industrial | Trash | NONE | 1/1/2022 | 1/4/2022 | $435.72 | STATE TAX | 1/4/2022 | 1/4/2022 | 2.81 | TN | TAX | STAT | Apr-22 | Apr-22 | 60.74 | TN | 100.0 | 60.74 | 0.0 | N
75666 | DC 991 | 20220100 | 853-0136012 | 853-0136012-00001-01 | FLOOR AND DECOR DC 991 40YD | 5120 CEDAR PORT PKWY | BAYTOWN | TX | 77523 | RO | 40 | NaN | PERMANENT | Industrial | Trash | NONE | 1/1/2022 | 1/4/2022 | $435.72 | STATE TAX | 1/4/2022 | 1/4/2022 | 2.81 | TN | TAX | STAT | Mar-22 | Mar-22 | 48.04 | TN | 100.0 | 48.04 | 0.0 | N
75667 | DC 991 | 20220100 | 853-0136012 | 853-0136012-00001-01 | FLOOR AND DECOR DC 991 40YD | 5120 CEDAR PORT PKWY | BAYTOWN | TX | 77523 | RO | 40 | NaN | PERMANENT | Industrial | Trash | NONE | 1/1/2022 | 1/4/2022 | $435.72 | STATE TAX | 1/4/2022 | 1/4/2022 | 2.81 | TN | TAX | STAT | Feb-22 | Feb-22 | 52.36 | TN | 100.0 | 52.36 | 0.0 | N
75668 | DC 991 | 20220100 | 853-0136012 | 853-0136012-00001-01 | FLOOR AND DECOR DC 991 40YD | 5120 CEDAR PORT PKWY | BAYTOWN | TX | 77523 | RO | 40 | NaN | PERMANENT | Industrial | Trash | NONE | 1/1/2022 | 1/4/2022 | $435.72 | STATE TAX | 1/4/2022 | 1/4/2022 | 2.81 | TN | TAX | STAT | Jan-22 | Jan-22 | 29.55 | TN | 100.0 | 29.55 | 0.0 | N
75669 | GJTEMP12 | 20220700 | 625-8126942 | 625-8126942-00001-01 | FLOOR & DECOR 188 GJ TEMP | 16300 W BLUEMOUND RD | BROOKFIELD | WI | 53005 | HP | 0 | NaN | TEMPORARY | Commercial | Trash | NONE | 7/1/2022 | 6/11/2022 | $1,673.41 | GOT JUNK | 6/11/2022 | 6/11/2022 | 1.00 | EA | MISC | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN

Duplicate rows

Most frequently occurring

Index | Location ID | Invoice Number | Account | Asset ID | Location | Address | City | State | Zip Code | Container Type | Container Size | Compactor | Asset Type | Asset Category | Waste Stream | Waste Type | Report Date | Charge Date | Charge Amount | Charge Code Desc. | From Date | To Date | Quantity | Unit of Measure | Charge Group | Charge Sub-Type | Report Period | Service Period | Original Volume | UOM for Volume | Pounds per Yard | Trash Tons | Recycle Tons | Tons Derived | # duplicates
0 | 102 | 20220900 | 800-0304124 | 800-0304124-00001-01 | FLOOR & DECOR 102 30YD CLOSING | 1056 PERSONAL PL | MORROW | GA | 30260 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | NONE | 9/1/2022 | 9/9/2022 | $175.00 | SERVICE ATTEMPT | 9/9/2022 | 9/9/2022 | 1.0 | EA | MISC | TRIP | Jul-22 | Jul-22 | 3.93 | TN | 100.0 | 3.93 | 0.0 | N | 3
1 | 102 | 20221000 | 800-0303428 | 800-0303428-00001-01 | FLOOR & DECOR 102 C&D TEMP | 1056 PERSONAL PL | MORROW | GA | 30260 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | CONSTRUCTION/DEMOLITION DEBRIS | 10/1/2022 | 10/13/2022 | $160.44 | HAUL | 10/13/2022 | 10/13/2022 | 1.0 | EA | HAUL | NaN | Jul-22 | Jul-22 | 27.29 | TN | 100.0 | 27.29 | 0.0 | N | 3
2 | 102 | 20221000 | 800-0303428 | 800-0303428-00001-01 | FLOOR & DECOR 102 C&D TEMP | 1056 PERSONAL PL | MORROW | GA | 30260 | RO | 30 | NaN | TEMPORARY | Industrial | Trash | CONSTRUCTION/DEMOLITION DEBRIS | 10/1/2022 | 10/13/2022 | $160.44 | HAUL | 10/13/2022 | 10/13/2022 | 1.0 | EA | HAUL | NaN | Oct-22 | Oct-22 | 17.80 | TN | 100.0 | 17.80 | 0.0 | N | 3
3 | 105 | 20220600 | 853-0124694 | 853-0124694-00001-01 | FLOOR & DECOR TEMP | 17211 NORTH FWY | HOUSTON | TX | 77090 | RO | 40 | NaN | TEMPORARY | Industrial | Trash | NONE | 6/1/2022 | 6/29/2022 | $335.98 | HAUL | 6/29/2022 | 6/29/2022 | 1.0 | EA | HAUL | NaN | Apr-22 | Apr-22 | 19.76 | TN | 100.0 | 19.76 | 0.0 | N | 2
4 | 105 | 20220600 | 853-0124694 | 853-0124694-00001-01 | FLOOR & DECOR TEMP | 17211 NORTH FWY | HOUSTON | TX | 77090 | RO | 40 | NaN | TEMPORARY | Industrial | Trash | NONE | 6/1/2022 | 6/29/2022 | $335.98 | HAUL | 6/29/2022 | 6/29/2022 | 1.0 | EA | HAUL | NaN | Aug-22 | Aug-22 | 22.67 | TN | 100.0 | 22.67 | 0.0 | N | 2
5 | 105 | 20220600 | 853-0124694 | 853-0124694-00001-01 | FLOOR & DECOR TEMP | 17211 NORTH FWY | HOUSTON | TX | 77090 | RO | 40 | NaN | TEMPORARY | Industrial | Trash | NONE | 6/1/2022 | 6/29/2022 | $335.98 | HAUL | 6/29/2022 | 6/29/2022 | 1.0 | EA | HAUL | NaN | Feb-22 | Feb-22 | 7.40 | TN | 100.0 | 7.40 | 0.0 | N | 2
6 | 105 | 20220600 | 853-0124694 | 853-0124694-00001-01 | FLOOR & DECOR TEMP | 17211 NORTH FWY | HOUSTON | TX | 77090 | RO | 40 | NaN | TEMPORARY | Industrial | Trash | NONE | 6/1/2022 | 6/29/2022 | $335.98 | HAUL | 6/29/2022 | 6/29/2022 | 1.0 | EA | HAUL | NaN | Jan-22 | Jan-22 | 25.95 | TN | 100.0 | 25.95 | 0.0 | N | 2
7 | 105 | 20220600 | 853-0124694 | 853-0124694-00001-01 | FLOOR & DECOR TEMP | 17211 NORTH FWY | HOUSTON | TX | 77090 | RO | 40 | NaN | TEMPORARY | Industrial | Trash | NONE | 6/1/2022 | 6/29/2022 | $335.98 | HAUL | 6/29/2022 | 6/29/2022 | 1.0 | EA | HAUL | NaN | Jul-22 | Jul-22 | 13.88 | TN | 100.0 | 13.88 | 0.0 | N | 2
8 | 105 | 20220600 | 853-0124694 | 853-0124694-00001-01 | FLOOR & DECOR TEMP | 17211 NORTH FWY | HOUSTON | TX | 77090 | RO | 40 | NaN | TEMPORARY | Industrial | Trash | NONE | 6/1/2022 | 6/29/2022 | $335.98 | HAUL | 6/29/2022 | 6/29/2022 | 1.0 | EA | HAUL | NaN | Jun-22 | Jun-22 | 12.07 | TN | 100.0 | 12.07 | 0.0 | N | 2
9 | 105 | 20220600 | 853-0124694 | 853-0124694-00001-01 | FLOOR & DECOR TEMP | 17211 NORTH FWY | HOUSTON | TX | 77090 | RO | 40 | NaN | TEMPORARY | Industrial | Trash | NONE | 6/1/2022 | 6/29/2022 | $335.98 | HAUL | 6/29/2022 | 6/29/2022 | 1.0 | EA | HAUL | NaN | Mar-22 | Mar-22 | 13.59 | TN | 100.0 | 13.59 | 0.0 | N | 2
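
A per-row occurrence count like the "# duplicates" column above can be approximated by counting identical rows and keeping those seen more than once. The sketch below uses `collections.Counter` on tuples. The function name `duplicate_counts` and the three-field rows are hypothetical, and the report does not state exactly how its own count is derived, so treat this as one plausible reading.

```python
from collections import Counter


def duplicate_counts(rows):
    """Return {row: occurrences} for rows that appear more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical rows of (Location ID, Charge Code Desc., Service Period);
# the values are invented and much narrower than the real 34-column rows.
rows = [
    ("102", "HAUL", "Jul-22"),
    ("102", "HAUL", "Jul-22"),
    ("102", "HAUL", "Jul-22"),
    ("105", "HAUL", "Apr-22"),
    ("105", "HAUL", "Apr-22"),
    ("101", "DISPOSAL CHARGE", "Dec-22"),
]

dups = duplicate_counts(rows)
# List the most frequently occurring duplicates first.
for row, n in sorted(dups.items(), key=lambda item: -item[1]):
    print(row, n)
```

Rows must match on every field to count as duplicates, which is why two sample rows that differ only in Service Period are listed separately in the table.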